SCORING FORMULAS AND PROBABILITY CONSIDERATIONS


ALEXANDER CALANDRA 
DEPARTMENT OF CHEMISTRY, BROOKLYN COLLEGE 


Bayes’ theorem of inverse probability is made the basis of a 
general equation for scoring objective examinations. The equation 
so obtained is evaluated by assuming a binomial distribution of ex- 
aminee knowledge and guessing tendency. A graphical illustration 
of the application of this equation to a hypothetical test situation is 
presented. The limitations inherent in the use of Bayes’ theorem 
make it inadvisable to recommend the practical use of the equa- 
tion unless future experimental evidence indicates an increase in 
scoring validity which more than compensates for the increase in 
scoring difficulty. 


During the past two decades a number of statistical studies have 
been undertaken for the purpose of determining the relative merits 
of the formulas which have been proposed for the scoring of objec- 
tive examinations. As a result of these studies, it has become evident 
that the formulas which seemed to be based on the soundest theoreti- 
cal considerations were hardly, if at all, superior to those of empirical 
origin. Now, while this apparent failure of the theory of probability 
to provide a reasonably superior approach to the development of 
scoring formulas has brought forth a number of explanations whose 
validity has always been considered an open question, there has been 
little or no attempt to re-examine the theoretical foundations of the 
formulas. This situation has prompted the writer to present the fol- 
lowing development of a general method of scoring based upon con- 
siderations which may be more appropriate than those suggested up 
to this point. 

The purpose in developing any scoring formula is the derivation 
of that statistical function which will give the best estimate of the 
number of correct responses which were not the result of guessing. 
The problem which this raises is of the general type dealing with 
events which are the result of the operation of known and unknown 
causes, and can be classified as “a priori” or “a posteriori” depending
on whether it involves the calculation of the most probable nature of 
the event from a knowledge of the known causes, or the calculation 
of the most probable extent of the known causes from a knowledge 
of the event. 

Problems in “a priori” probability are amenable to simple solutions
of unquestioned validity. Thus on the basis of the most elementary
considerations the answer to the problem of determining the most
probable raw score of a student who knows 50 items on a true-false 
test and guesses at an additional 30, is given by the expression 


50 + 30/2=65. 


Problems in “a posteriori” probability are often of considerable 
complexity and even when solvable sometimes yield ambiguous or 
even meaningless results. Thus the problem of determining the most 
probable value of the number of items known by a student who ob- 
tains a raw score of 25 correct items and 50 incorrect items yields a 
meaningless solution when it is assumed that the number of responses 
guessed correctly is equal to the number of responses guessed incor- 
rectly. Most scoring problems involve questions of “a posteriori” 
probability and the formulas which have been derived for their solu- 
tion are based on assumptions analogous to the one indicated above. 
It is possible, however, to avoid such assumptions by an application
of Bayes’ theorem of inverse probabilities* and to obtain a general 
solution which is applicable to a variety of objective questions. 
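
The two numerical cases above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are assumptions of this example, and the second function encodes the assumption, criticized in the text, that correct and incorrect guesses are equal in number on a true-false test.

```python
def expected_raw_score(known, guessed, p_correct=0.5):
    """A priori case: most probable raw score for `known` items known
    plus `guessed` items guessed, each guess right with p_correct."""
    return known + guessed * p_correct


def rights_minus_wrongs(right, wrong):
    """A posteriori case: items known, assuming the number guessed
    correctly equals the number guessed incorrectly."""
    return right - wrong


print(expected_raw_score(50, 30))   # 65.0
print(rights_minus_wrongs(25, 50))  # -25
```

A negative number of known items is impossible, which is the sense in which the second solution is meaningless.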

According to Bayes’ theorem† the probability $P_R^{k}$ that an event R
was the result of a cause k is given by the expression

$$P_R^{k} = \frac{P_k \, P_k^{R}}{\sum_z P_z \, P_z^{R}} \qquad (1)$$

where

$P_k$ = the probability that the cause k exists.
$P_z$ = the probability that any cause z exists.
$P_k^{R}$ = the probability that the operation of cause k will
give rise to the event R.
$P_z^{R}$ = the probability that the operation of any cause z
will give rise to the event R.


The proof of (1) can be found in most textbooks dealing with prob- 
ability. 
By extending Bayes’ theorem it is possible to develop a general 
scoring formula in terms of the following definitions: 
n= the total number of items on the examination. 
R= the number of items the examinee answers correctly. 
W= the number of items the examinee answers incorrectly. 


* The use of Bayes’ theorem as a basis for scoring formulas was suggested 
by the writer in his unpublished doctorate dissertation at the School of Educa- 
tion of New York University. 

† Fisher, A. The mathematical theory of probabilities. New York: The
Macmillan Co., 1930.
y = the number of items the examinee attempts; y = R + W.
x = the number of items known to the examinee.

$P_x$ = the probability that an examinee who knows x items
exists in the group taking the test.

$P_x^{y-x}$ = the probability that an examinee who knows x items
will guess at y - x additional items.

$P_{y-x}^{R-x}$ = the probability that an examinee who guesses at y - x
items will obtain (R - x) correct guess responses, i.e., a
total of R correct responses.

$P_{R,y}^{x}$ = the probability that a raw score of R correct items out
of a total of y responses originated from a knowledge of
x items.

Bayes’ theorem can be used to calculate $P_{R,y}^{x}$ by means of the
substitutions

$$P_R^{k} = P_{R,y}^{x}, \qquad P_k = P_x, \qquad P_k^{R} = P_x^{y-x} \cdot P_{y-x}^{R-x}.$$

These substitutions yield

$$P_{R,y}^{x} = \frac{P_x \cdot P_x^{y-x} \cdot P_{y-x}^{R-x}}{\sum_x P_x \cdot P_x^{y-x} \cdot P_{y-x}^{R-x}} \qquad (2)$$


By the use of (2) and the theorem of the mean value, the score S
which can be assigned to an examinee who attempts y items and
answers R correctly is given by the equation

$$S = \sum_{x=0}^{R} x \cdot P_{R,y}^{x} \qquad (3)$$


According to equation (3) the best score which can be assigned to a 
paper containing R correct responses out of a total of y responses is 
the mean value of x for all such papers. 

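Combining (2) and (3),

$$S = \frac{\sum_x x \cdot P_x \cdot P_x^{y-x} \cdot P_{y-x}^{R-x}}{\sum_x P_x \cdot P_x^{y-x} \cdot P_{y-x}^{R-x}} \qquad (4)$$
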
The probabilities $P_x$, $P_x^{y-x}$, and $P_{y-x}^{R-x}$ can be evaluated by
considering them to arise from binomial distributions generated by the
corresponding probabilities $p_1$, $p_2$, and $p_3$, which are defined as
follows:
$p_1$ = the probability that an examinee knows any given item.
$p_2$ = the probability that an examinee will guess at any
given item.
$p_3$ = the probability that an examinee who guesses at an
item will guess it correctly.
These definitions make possible the following evaluations:

$$P_x = \frac{n!}{(n-x)!\,(x)!}\,(p_1)^{x}\,(1-p_1)^{n-x},$$

$$P_x^{y-x} = \frac{(n-x)!}{(n-y)!\,(y-x)!}\,(p_2)^{y-x}\,(1-p_2)^{n-y},$$

$$P_{y-x}^{R-x} = \frac{(y-x)!}{(y-R)!\,(R-x)!}\,(p_3)^{R-x}\,(1-p_3)^{y-R}.$$

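These expressions can be simplified by the use of the following
substitutions:

$$k_1 = \frac{p_1}{1-p_1}, \qquad k_2 = \frac{p_2}{1-p_2}, \qquad \text{and} \qquad k_3 = \frac{p_3}{1-p_3},$$

yielding the relationships

$$P_x = \frac{n!\,(1-p_1)^{n}\,(k_1)^{x}}{(n-x)!\,(x)!},$$

$$P_x^{y-x} = \frac{(n-x)!\,(1-p_2)^{n-x}\,(k_2)^{y-x}}{(n-y)!\,(y-x)!},$$

$$P_{y-x}^{R-x} = \frac{(y-x)!\,(1-p_3)^{y-x}\,(k_3)^{R-x}}{(y-R)!\,(R-x)!}.$$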


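As a result of the foregoing relationships, and since $(1-p_2)k_2 = p_2$
and $(1-p_3)k_3 = p_3$,

$$P_x \cdot P_x^{y-x} \cdot P_{y-x}^{R-x} = \frac{n!\,(n-x)!\,(y-x)!\,(1-p_1)^{n}(1-p_2)^{n}(1-p_3)^{y}\,(k_1)^{x}(k_2)^{y}(k_3)^{R}\,(p_2)^{-x}(p_3)^{-x}}{(n-x)!\,(y-x)!\,(n-y)!\,(y-R)!\,(R-x)!\,(x)!} \qquad (5)$$

In determining the value of S for any single examination paper, it
should be noted that n, R, y, $p_2$, $p_3$, $k_1$, $k_2$, and $k_3$ are all
constants and that only those functions containing x are variable. It is
therefore possible to make the substitution in (5)

$$K = \frac{n!\,(1-p_1)^{n}(1-p_2)^{n}(1-p_3)^{y}\,(k_2)^{y}(k_3)^{R}}{(n-y)!\,(y-R)!}$$

and obtain as a result

$$P_x \cdot P_x^{y-x} \cdot P_{y-x}^{R-x} = \frac{K\,(k_1)^{x}(p_2)^{-x}(p_3)^{-x}}{(R-x)!\,(x)!}.$$

But

$$(k_1)^{x}(p_2)^{-x}(p_3)^{-x} = \left(\frac{k_1}{p_2\,p_3}\right)^{x}.$$

For the purpose of algebraic convenience, let

$$Z = \frac{k_1}{p_2\,p_3}.$$

Then

$$\sum_{x=0}^{R} P_x \cdot P_x^{y-x} \cdot P_{y-x}^{R-x} = K \sum_{x=0}^{R} \frac{Z^{x}}{(R-x)!\,(x)!}. \qquad (6)$$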
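Expansion of

$$Q_0 = \sum_{x=0}^{R} \frac{Z^{x}}{(R-x)!\,(x)!}$$

gives

$$Q_0 = \frac{1}{R!\,0!} + \frac{Z}{(R-1)!\,1!} + \frac{Z^{2}}{(R-2)!\,2!} + \frac{Z^{3}}{(R-3)!\,3!} + \cdots$$

Multiplying by R!,

$$Q_0\,R! = 1 + \frac{ZR}{1!} + \frac{Z^{2}R(R-1)}{2!} + \frac{Z^{3}(R)(R-1)(R-2)}{3!} + \cdots$$

Comparison of this series with the expansion of $(a+b)^{n}$ shows that

$$Q_0\,R! = (1+Z)^{R}; \qquad (7)$$

$$Q_0 = \sum_{x=0}^{R} \frac{Z^{x}}{(R-x)!\,(x)!} = \frac{(1+Z)^{R}}{R!}. \qquad (8)$$

Substituting (8) in (6),

$$\sum_{x=0}^{R} P_x \cdot P_x^{y-x} \cdot P_{y-x}^{R-x} = \frac{K\,(1+Z)^{R}}{R!}. \qquad (9)$$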


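The numerator of (4) can be evaluated by an analogous process, all
steps being identical, leading to the expression

$$\sum_{x=0}^{R} x \cdot P_x \cdot P_x^{y-x} \cdot P_{y-x}^{R-x} = K \sum_{x=0}^{R} \frac{x\,Z^{x}}{(R-x)!\,(x)!}. \qquad (10)$$

By expansion,

$$Q_1 = \sum_{x=0}^{R} \frac{x\,Z^{x}}{(R-x)!\,(x)!} = \frac{Z}{(R-1)!\,(1)!} + \frac{2Z^{2}}{(R-2)!\,(2)!} + \frac{3Z^{3}}{(R-3)!\,(3)!} + \cdots$$

or

$$Q_1 = \frac{Z}{(R-1)!\,(0)!} + \frac{Z^{2}}{(R-2)!\,(1)!} + \frac{Z^{3}}{(R-3)!\,(2)!} + \frac{Z^{4}}{(R-4)!\,(3)!} + \cdots \qquad (11)$$

and

$$\frac{Q_1\,(R-1)!}{Z} = 1 + \frac{Z(R-1)}{(1)!} + \frac{Z^{2}(R-1)(R-2)}{2!} + \cdots$$

But by comparison of the resulting series with the expansion of
$(a+b)^{n}$,

$$\frac{Q_1\,(R-1)!}{Z} = (1+Z)^{R-1}$$

and

$$Q_1 = \frac{Z(1+Z)^{R-1}}{(R-1)!} = \frac{Z(1+Z)^{R-1}R}{R!}. \qquad (12)$$

Combining (4), (9), and (12) yields

$$S = \frac{ZR}{1+Z} = \frac{R}{1 + p_2\,p_3\left[\dfrac{1}{p_1} - 1\right]}. \qquad (13)$$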
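
The closed form can be checked against the direct summation in (4). The following is a minimal sketch under the binomial assumptions stated above; the function names are hypothetical, and the closed form is written as this example reads the derivation, with Z = p1 / ((1 - p1) p2 p3).

```python
from math import comb


def posterior_weights(n, y, R, p1, p2, p3):
    """Unnormalized probability, for x = 0..R, that x items were known
    given y items attempted and R answered correctly."""
    weights = []
    for x in range(R + 1):
        knows = comb(n, x) * p1**x * (1 - p1) ** (n - x)
        guesses = comb(n - x, y - x) * p2 ** (y - x) * (1 - p2) ** (n - y)
        hits = comb(y - x, R - x) * p3 ** (R - x) * (1 - p3) ** (y - R)
        weights.append(knows * guesses * hits)
    return weights


def score_by_summation(n, y, R, p1, p2, p3):
    """Mean value of x over the posterior distribution, as in (4)."""
    w = posterior_weights(n, y, R, p1, p2, p3)
    return sum(x * wx for x, wx in enumerate(w)) / sum(w)


def score_closed_form(R, p1, p2, p3):
    """S = ZR / (1 + Z), the closed form assumed in this sketch."""
    z = p1 / ((1 - p1) * p2 * p3)
    return R * z / (1 + z)


# Hypothetical test: 20 items, 16 attempted, 12 correct, all p's = 1/2.
print(score_by_summation(20, 16, 12, 0.5, 0.5, 0.5))
print(score_closed_form(12, 0.5, 0.5, 0.5))
```

The two functions should agree to rounding error for any admissible inputs, since n and y enter only through factors that cancel between numerator and denominator.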


In applying (13) to the scoring of any set of objective items, it is
necessary to evaluate $p_1$, $p_2$, and $p_3$. For a set of true-false
questions this can be done as follows:

Evaluation of $p_1$:

In deriving (13) it was assumed that a binomial distribution of
examinee knowledge existed. The value of $p_1$ for this distribution can
be estimated from the raw scores made on the examination by the
group taking the test and is equal to

$$p_1 = \frac{\Sigma R - \Sigma W}{\Sigma n},$$

where
$\Sigma R$ is the sum of all the correct responses by all the examinees,
$\Sigma W$ is the sum of all the incorrect responses by all the examinees,
$\Sigma n$ is the sum of all the examination items on all the papers.


Evaluation of $p_2$:
The derivation of (13) assumed that a binomial distribution of
guessing tendency existed. The value of $p_2$ for this distribution can be
obtained from the raw scores of all students who answered exactly
$R + W = K_1$ items, and is equal to

$$p_2 = \frac{\text{the number of items guessed at}}{\text{the number of items available for guessing}}$$

or

$$p_2 = \frac{\text{the number of items attempted} - \text{the number of items known}}{\text{the number of items on the test} - \text{the number of items known}}.$$

This can be estimated as

$$p_2 = \frac{(\Sigma R) + (\Sigma W) - [(\Sigma R) - (\Sigma W)]}{(\Sigma n) - [(\Sigma R) - (\Sigma W)]} = \frac{2(\Sigma W)}{(\Sigma n) - (\Sigma R) + (\Sigma W)},$$

where
$(\Sigma R)$ is the sum of all the correct responses made by
all the examinees answering exactly $K_1$ items,
and corresponding definitions hold for the other
terms.


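Evaluation of $p_3$:
From the nature of a set of true-false questions,

$$p_3 = 1/2.$$

From these evaluations and equation (13),

$$S = \frac{R}{1 + \left[\dfrac{\Sigma n}{\Sigma R - \Sigma W} - 1\right]\left[\dfrac{(\Sigma W)}{(\Sigma n) - (\Sigma R) + (\Sigma W)}\right]} \qquad (14)$$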
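
The estimation steps for a true-false test can be sketched as follows. All counts below are invented for illustration, and the function names are assumptions of this example. Unparenthesized sums in the text correspond to the whole group; parenthesized sums correspond to the subgroup attempting the same number of items as the paper being scored.

```python
def estimate_p1(sum_r, sum_w, sum_n):
    """Proportion of items known, taken over the whole group."""
    return (sum_r - sum_w) / sum_n


def estimate_p2(sub_r, sub_w, sub_n):
    """Proportion of unknown items guessed at, taken over the subgroup
    of examinees who attempted the same number of items."""
    known = sub_r - sub_w
    return (sub_r + sub_w - known) / (sub_n - known)


def true_false_score(R, p1, p2, p3=0.5):
    """Score S = R / (1 + p2 * p3 * (1/p1 - 1))."""
    return R / (1 + p2 * p3 * (1 / p1 - 1))


# Hypothetical data: 100 papers of 20 items in the whole group,
# 10 papers in the subgroup that attempted exactly 16 items.
p1 = estimate_p1(sum_r=1300, sum_w=300, sum_n=2000)  # 0.5
p2 = estimate_p2(sub_r=120, sub_w=40, sub_n=200)     # 80 / 120
print(round(true_false_score(12, p1, p2), 2))        # 9.0
```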


The significance of equation (14) can be illustrated by its application
to the following hypothetical test situation. A true-false test of 20
items is submitted to a large group of examinees. It is desired to
assign a score to an examinee who answers only 16 items and makes
12 correct responses.
Curve I   This represents the assumed binomial frequency distribution
          of examinee knowledge. In this curve n = 20, and $p_1$,
          which should be calculated from the examination data, has
          been taken as 1/2.

Curve II  This represents the frequency distribution of knowledge
          among those examinees who answer exactly $K_1$ items. In
          this curve $K_1$ is 16, and $p_2$, which should be calculated
          from the data, has been taken as 1/2.

Curve III This represents the frequency distribution of knowledge
          among those examinees who answer exactly 16 items and
          obtain exactly 12 correct responses. The abscissa
          corresponding to the ordinate which bisects the area under
          curve III is given by S. This can be calculated from the
          graph or from the equation and is in both cases equal to
          9.8.


Equations (4), (13), and (14) are presented chiefly as a matter of 
theoretical interest. The practical use of these equations is not rec- 
ommended unless adequate experimental evidence indicates an in- 
crease in scoring validity which more than compensates for the in- 
crease in scoring difficulty. The writer hopes to make available in a 
later paper a discussion of the practical and theoretical limitations 
of these equations. 

The writer takes pleasure in acknowledging his indebtedness to 
Edward E. Cureton, John C. Flanagan, Charles T. Molloy, and Paul 
V. West for their very helpful discussions of the subject matter in 
this article. 
An illustration of an application of Bayes’ Theorem to a scoring problem.
THE PHI COEFFICIENT AND CHI SQUARE AS INDICES
OF ITEM VALIDITY 


J. P. GUILFORD 
UNIVERSITY OF SOUTHERN CALIFORNIA 


Two new methods of item analysis are described. One involves
the computation of the φ coefficient (correlation of a fourfold point
distribution) and the other involves chi square. The only data
required are the proportions of passing individuals in the upper and
lower criterion groups, for the determination of φ, and in addition,
N, for the determination of chi square. Abacs are presented for
graphic solution of the two indices of validity, and tests of
significance are provided.


Many of the devices for determining item validity are based upon
the principle of the correlation between an item and some criterion
variable. It is also common practice to employ dichotomous
classifications in both the item variable and the criterion variable, using
the proportions of passing individuals in the high and low criterion
groups. The two criterion groups, furthermore, are equal populations:
upper and lower quarters, 27 per cents, or halves of the entire
criterion population. The two methods to be described are applicable
when this situation obtains. The one method, which requires
the computation of a φ coefficient, is based upon the correlation
principle. The other method, which involves the computation of chi
square, rests upon the principle of divergence of the proportions of
passing individuals among the high and low criterion groups from a
purely random or chance distribution. Both indices of validity have
the logical advantage that they do not require the assumption of a
continuous distribution in either the item or criterion variable,
although both are applicable when one or both distributions are
continuous.* There are times when the criterion variable in particular
represents a division into two discrete classes, as for example the two
sexes when items for masculinity-femininity are being validated.


The Phi Test of Validity 
The customary formula for the computation of φ is given as

$$\phi = \frac{\alpha\delta - \beta\gamma}{\sqrt{p\,q\,p'\,q'}} \qquad (1)$$

in which α, β, γ, and δ are the proportions occupying the four cells
of a fourfold table,

p is the sum of the first row and equals α + β,
q is the sum of the second row, γ + δ,
p′ is the sum of the first column, or α + γ,
and q′ is the sum of the second column, or β + δ.

*I am indebted to Mr. H. M. Cox for a critical review of the ideas set forth
in this paper.


When applied to the correlation of an item with a criterion, as
under the conditions previously specified, the computation of φ is
much simplified for the reason that p′ = q′ = .5, since the populations
in upper and lower criterion groups are equal in size. The fourfold
table appears as follows, if we make certain substitutions in symbols:

              U         L
      P     $p'_u$    $p'_l$   |   p
      F     $q'_u$    $q'_l$   |   q
              .5        .5     |  1.00

In this table U and L stand for the upper and lower criterion groups.
P and F stand for the passing and failing populations. The symbol
$p'_u$ stands for the proportion of the entire criterion population
(including both upper and lower groups) who are both in the upper
group and pass the item. The other primed symbols are to be
interpreted in analogous manner.

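With these symbols the formula for φ becomes

$$\phi = \frac{p'_u\,q'_l - p'_l\,q'_u}{.5\sqrt{pq}}. \qquad (2)$$

From the fourfold table given above,

$$q'_u = .5 - p'_u$$

and

$$q'_l = .5 - p'_l.$$

Substituting these for $q'_u$ and $q'_l$ in (2),

$$\phi = \frac{p'_u(.5 - p'_l) - p'_l(.5 - p'_u)}{.5\sqrt{pq}} = \frac{.5\,(p'_u - p'_l)}{.5\sqrt{pq}} = \frac{p'_u - p'_l}{\sqrt{pq}}. \qquad (3)$$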

But it is usually more convenient to derive from the test data
the proportions of the upper and lower groups alone who pass an
item. Let these proportions be designated by $p_u$ and $p_l$, respectively.
Then

$$p'_u - p'_l = \frac{p_u - p_l}{2}$$

and therefore, substituting in (3),

$$\phi = \frac{p_u - p_l}{2\sqrt{pq}}, \qquad (4)$$

which is the formula proposed for practical use. It requires only the
knowledge of the proportion of successes in the upper and lower
criterion groups ($p_u$ and $p_l$) and the proportions of successes and
failures in the two groups combined (p and q). The constant p is
derivable directly from $p_u$ and $p_l$, being given by

$$p = \frac{p_u + p_l}{2}$$

and q = 1 - p.
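
Formula (4) takes only a few lines to compute. The sketch below assumes equal-sized upper and lower criterion groups, as the text requires; the function name is an assumption of this example.

```python
from math import sqrt


def phi_coefficient(p_u, p_l):
    """Phi from the proportions passing in the upper and lower groups,
    assuming the two criterion groups are equal in size."""
    p = (p_u + p_l) / 2          # proportion passing in both groups combined
    q = 1 - p
    return (p_u - p_l) / (2 * sqrt(p * q))


print(round(phi_coefficient(0.93, 0.87), 3))  # 0.1
print(round(phi_coefficient(0.60, 0.40), 3))  # 0.2
```

The function is undefined when every examinee passes or every examinee fails (p equal to 1 or 0), since the denominator is then zero.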

It may be pointed out in passing that the simple method of item
analysis which employs merely the index ($p_u - p_l$) ignores entirely
the dispersion of individuals on the item variable. The expression
$\sqrt{pq}$ in the denominator of (4) gives the standard deviation of the
item provided we assume a symmetrical dispersion of the item. The
S. D. of the item is at a maximum when p = q = .5 and approaches
zero when p approaches 0 or 1.00. The φ index of validity of an item
is seen to be directly proportional to the difference in proportion of
successes in the two criterion groups, but at the same time it is
inversely proportional to the variability of the item. If φ is adopted as
the index of validity, then it follows that the same numerical
difference ($p_u - p_l$) is more significant when p approaches 0 or 1.00 and
least significant when p = q = .50.

In order to facilitate the computation of φ, an abac was
constructed as shown in Fig. 1.* In using the graphic solution of φ,
only the values of $p_u$ and $p_l$ need be known. $p_u$ is to be sought on the
ordinate and $p_l$ on the abscissa. The intersection of the two lines
corresponding to them gives the value of φ which is sought.

*I am indebted to N.Y.A. assistance in the preparation of the diagrams in
Figures 1 and 2.

The method of constructing the abac may be of interest. Beginning
with equation (4) and multiplying through by $\sqrt{pq}$, we have

$$\phi\sqrt{pq} = \frac{p_u - p_l}{2} = p_u - p = p - p_l. \qquad (5)$$

By letting φ take on constant values in turn, one can readily
determine the lines of equal φ. For example, if we are interested in the
line for φ = .10, letting p take on various values we can find a set of
coordinates which will determine the line. When p = .9, $\sqrt{pq}$ = .3,
and $\phi\sqrt{pq}$ = .03. From equation (5), $p_u$ then equals .93 and $p_l$
equals .87. Other coordinates are similarly computed.
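
The construction just described can be sketched in code: for a fixed φ, each value of p yields one point on the line of equal φ. The function name and the list of p values are assumptions of this example.

```python
from math import sqrt


def phi_line_point(phi, p):
    """Coordinates (p_u, p_l) on the line of constant phi for a given p,
    using phi * sqrt(pq) = p_u - p = p - p_l."""
    offset = phi * sqrt(p * (1 - p))
    return p + offset, p - offset


# Points on the line phi = .10 for several values of p.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    p_u, p_l = phi_line_point(0.10, p)
    print(f"p = {p:.1f}  p_u = {p_u:.3f}  p_l = {p_l:.3f}")
```

For p = .9 this reproduces the coordinates .93 and .87 given in the text.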

The lower right-hand part of the abac is a mirror reflection of
the upper-left part, with $p_u$ and $p_l$ having interchanged roles. The
diagram is given in complete form with all negative φ’s not only
because an occasional item correlates negatively with the criterion but
also because the abac has general application to problems other than
that of item analysis, remembering, of course, that it applies only
when p′ = q′ = .50, and that $p_u = 2p'_u$ and $p_l = 2p'_l$. Presumably
other abacs could be constructed on the same principle for varying
values of p′ and q′, and a set of graphic charts having a role similar
to that of the Thurstone tetrachoric diagrams (1) would be the
result. It is doubtful, however, whether such charts would have
sufficient general application to justify their publication.

There is a question as to the comparability of a φ coefficient
under different conditions of separation of upper and lower criterion
groups. Exclusion of the middle segment of an entire criterion
population, when the variable is continuous, usually has the effect of
augmenting the numerical index of correlation. But so long as a
constant rule of division is followed, upper versus lower quarters, for
example, the φ coefficients should be comparable. Phi coefficients from
situations having varied degrees of exclusion of middle cases could
perhaps be made numerically consistent, but the writer has not
attempted to provide this refinement. As a matter of fact, owing to
varying ranges of talent or of trait difference in different
populations, even the consistent use of the same division, even of halves,
may not guarantee numerical consistency for any kind of index of
validity.

The question of size of significant φ coefficients for varying
values of N will be discussed following the presentation of the second
method, for it depends upon the size of chi square.


The Chi-Square Method 


With all the variety of procedures for determining item validity
it is surprising that mention of the chi-square technique has not more
frequently appeared. The direct relation of chi square to φ
immediately suggests the use of this statistic which is so prominent in the
newer statistics of small samples. The fact that tests of significance
are so readily forthcoming when chi square is computed makes this
statistic appealing in connection with item analysis. The relation to
φ is given by the equation

$$\chi^2 = N\phi^2 = \frac{N\,(p_u - p_l)^2}{4pq} \qquad (6)$$

to use the particular formula for φ which applies to the special case
of item analysis (formula (4)).

It would be possible, of course, to compute values of chi square
for all items, either using the formula just given, or by means of
nomographs designed for the purpose. The parameter N is an
additional element in the situation, however, and it necessitates a series
of abacs rather than just one as in the case of the computation of φ.
Instead of this extended computational practice, the writer
recommends a simple test of significance of item validity for varying values
of N. The size of N (the number of individuals in the upper and
lower criterion groups combined) is usually chosen for convenience
as some multiple of 10. An N of 50 is probably the minimal practical
value with which to make a test of validity of items and an N of 400
is probably the maximal practical value. The abac (see Fig. 2) was
prepared to show the lines of “significant” and “very significant” chi
squares for N’s of 50, 100, 200, and 400.* The locus of the
coordinates for a constant value of N was determined as follows.
Multiplying equation (6) through by 4pq/N, we have

$$\frac{4pq\,\chi^2}{N} = (p_u - p_l)^2. \qquad (7)$$

When N = 50,

$$.08\,pq\,\chi^2 = (p_u - p_l)^2. \qquad (8)$$

*The designations “significant” and “very significant” have Fisher’s usual
meaning. A “significant” chi square could occur in a truly random situation 5
times in 100 (P = .05) and a “very significant” chi square could occur similarly
only once in a hundred times (P = .01).
Reference to Fisher’s table of chi squares (2) shows that for one
degree of freedom, which obtains in a fourfold table, it requires a chi
square of 6.635 to be called very significant. Substituting this value
in equation (8), we have

$$.5308\,pq = (p_u - p_l)^2.$$

Taking the positive square roots we have,

$$.7285\sqrt{pq} = p_u - p_l. \qquad (9)$$

Letting p take on different values, sets of coordinates for a constant
chi square of 6.635 and an N of 50 can be computed. For example, if
p = .8, $\sqrt{pq}$ = .4. Then from (9), $p_u - p_l$ = .2914. But $p_u - p_l$
= 2($p_u$ - p) = 2(p - $p_l$). Therefore, the two coordinates we seek
are:

$$p_u = .800 + \frac{.2914}{2} = .946$$

and

$$p_l = .800 - \frac{.2914}{2} = .654,$$

to use three significant figures.
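
The same computation can be written as a short sketch for any N, any critical chi square, and any p. The function names are assumptions of this example; the critical values 3.841 and 6.635 are those quoted in the text for one degree of freedom.

```python
from math import sqrt


def chi_square(n, p_u, p_l):
    """Chi square for an item, equal to N * phi^2 with phi from formula (4)."""
    p = (p_u + p_l) / 2
    return n * (p_u - p_l) ** 2 / (4 * p * (1 - p))


def boundary_point(n, chi2, p):
    """Coordinates (p_u, p_l) on the line of constant chi square for a given p."""
    half_gap = sqrt(chi2 * p * (1 - p) / n)  # (p_u - p_l) / 2, from equation (7)
    return p + half_gap, p - half_gap


p_u, p_l = boundary_point(50, 6.635, 0.8)
print(round(p_u, 3), round(p_l, 3))        # 0.946 0.654
print(round(chi_square(50, p_u, p_l), 3))  # 6.635
```

Feeding the boundary coordinates back into `chi_square` recovers the critical value, which is a quick check that the two functions are inverses of one another along a line of constant p.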

Other pairs of coordinates were computed in a similar manner,
for other values of p and for N’s of 100, 200, and 400. In using the
abac in Fig. 2, any item whose coordinates $p_u$ and $p_l$ locate it above
and to the left of the curved line appropriate to the N of our criterion
population may be regarded as significantly valid or as very
significantly valid according as its chi square is greater than 3.841 (P =
.05) or 6.635 (P = .01), respectively. The line of zero chi square
would of course be defined by the equation $p_u = p_l$ and it would form
the diagonal of the chart. It was found empirically that the lines for
a chi square of 3.841 were practically identical with those for a chi
square of 6.635 when N is doubled. From this it follows that in
practice, doubling N when $p_u$ and $p_l$ remain constant has the effect of
lowering the probability of a random situation from .05 to .01, or of
moving an item from the category of “significantly valid” to that of
“very significantly valid.”

The use of the chi-square chart in Fig. 2 has the limitation of
merely discriminating between significantly versus non-significantly
valid items when it is used as already described. It would not place
items in the order of their validity. Rank order of the valid items
may be approximated, however, by plotting each item as a point
within the diagram and noting the relative diagonal distance from the line
of least significance.
Because of the close relationship between chi square and φ, tests
of significance of φ can be readily inferred from the tests of
significance of chi square. In other words, we can determine the least
“significant” and least “very significant” values of φ for different values
of N. It is simply a matter of determining the values of φ that are
equivalent to chi squares of 3.841 and 6.635, respectively, for
different values of N. A brief list of these significant φ’s is given in Table
1. The table has a small number of values of N, but their range is
more extensive than is needed in ordinary item-analysis problems
with the idea that it may be useful elsewhere.* Its use in connection
with φ coefficients should proceed with caution, however, particularly
when dealing with populations from which middle cases have been
excluded. In continuous distributions it is probably only strictly
applicable when upper and lower halves are included. When the table
is applicable, it naturally obviates the necessity of computing
standard errors of φ.


TABLE 1

Minimum Values of φ That are Significant and
Very Significant for Various Sizes of N

     N      χ² = 3.841, P = .05      χ² = 6.635, P = .01
     20            .438                     .576
     30            .358                     .470
     40            .310                     .407
     50            .277                     .364
     70                                     .308
    100            .196
    150                                     .210
    200                                     .182
    300            .112                     .150
    400            .098                     .129
    600            .080                     .105
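
Since chi square equals Nφ², the minimum significant φ for any N is the square root of the critical chi square divided by N. The sketch below illustrates that relation only; it is not a reproduction of the printed table, and the function name is an assumption of this example.

```python
from math import sqrt

CHI2_SIGNIFICANT = 3.841       # P = .05, one degree of freedom
CHI2_VERY_SIGNIFICANT = 6.635  # P = .01, one degree of freedom


def minimum_phi(n, chi2):
    """Smallest phi whose chi square (N * phi^2) reaches the critical value."""
    return sqrt(chi2 / n)


for n in (20, 30, 40, 50, 400, 600):
    low = minimum_phi(n, CHI2_SIGNIFICANT)
    high = minimum_phi(n, CHI2_VERY_SIGNIFICANT)
    print(f"N = {n:3d}   phi(.05) = {low:.3f}   phi(.01) = {high:.3f}")
```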
* Even when upper and lower quarters of a criterion population do not
exactly equal multiples of 10, a random sampling will make them do so, for general
purposes as well as for convenience in using the abacs and the table presented
Abac for the graphic estimation of φ as an index of item validity. $p_u$ = the
proportion of the upper criterion group who pass the item, and $p_l$ is the similar
proportion of the lower criterion group.
THE APPLICATION OF SHEPPARD’S CORRECTION
FOR GROUPING 


GEORGE A. FERGUSON 
DEPARTMENT OF EDUCATION, UNIVERSITY OF EDINBURGH 


This paper attempts to show in a non-mathematical way the in- 
fluence of grouping on standard deviations and correlations, and ad- 
vances empirical evidence to illustrate with what accuracy values 
corrected for grouping by Sheppard’s correction approximate to val- 
ues obtained from ungrouped data when the distributions are con- 
tinuous. This enquiry gained its initial stimulus from the observa- 
tion that many standard deviations and correlations reported by stu- 
dents of psychology and education are uncorrected for grouping and 
that frequently errors attributable to the grouping of data are not 
small when compared with errors of sampling. 


In the calculation of statistical measures from grouped data the 
values of each variate within a given class-interval are assigned the 
value of the mid-point of that interval. Thus in the calculation of a 
correlation coefficient from such data we are not calculating the rela- 
tionship between the continuous variates x and y , but rather the rela- 
tionship between the midpoints of certain class-intervals into which 
the variates x and y have been grouped. With distributions that taper 
off to zero at the extremities, the point of concentration of the vari- 
ate is not the mid-point of the class-interval but a point slightly near- 
er the mean. Thus statistical measures calculated from the odd mo- 
ments remain almost uninfluenced by grouping, because the errors 
made by the assumption that the scores are concentrated at the mid- 
point of each interval tend to balance on both sides of the mean, 
while with the even moments the errors do not balance but add to- 
gether. 
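
The contrast between odd and even moments can be illustrated with a small simulation. This is a sketch on invented data: the normal distribution, its mean of 100 and standard deviation of 12, the sample size, and the class-interval of 10 are all assumptions of this example.

```python
import random
import statistics


def group_to_midpoints(values, interval):
    """Replace each value by the midpoint of its class-interval
    (class boundaries fall at multiples of `interval`)."""
    return [(v // interval + 0.5) * interval for v in values]


random.seed(0)  # fixed seed so the run is repeatable
scores = [random.gauss(100, 12) for _ in range(20000)]  # hypothetical scores
grouped = group_to_midpoints(scores, 10)

mean_shift = statistics.fmean(grouped) - statistics.fmean(scores)
variance_shift = statistics.pvariance(grouped) - statistics.pvariance(scores)
print(f"shift in mean:     {mean_shift:.3f}")
print(f"shift in variance: {variance_shift:.3f}  (interval^2 / 12 = {10**2 / 12:.3f})")
```

In runs of this kind the mean is nearly unchanged by grouping, while the variance is inflated by an amount close to the square of the class-interval divided by 12.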

Grouping error increases the size of the uncorrected standard
deviations, and reduces the size of the uncorrected correlations. The
usual formula for correcting a standard deviation for grouping is as
follows:

$$s = \sqrt{\hat{s}^2 - \frac{i^2}{12}} \qquad (1)$$

where $s$ and $\hat{s}$ are the corrected and uncorrected estimates, respectively,
of the standard deviation, and $i$ is the class-interval.
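
Sheppard's correction for a standard deviation subtracts one twelfth of the squared class-interval from the uncorrected variance. A minimal sketch follows; the function name is an assumption of this example, and the input is the uncorrected standard deviation reported for variable x at a class-interval of 2.

```python
from math import sqrt


def sheppard_corrected_sd(sd_uncorrected, interval):
    """Standard deviation corrected for grouping by Sheppard's correction."""
    return sqrt(sd_uncorrected**2 - interval**2 / 12)


# Uncorrected S.D. of 12.1549 at a class-interval of 2.
print(f"{sheppard_corrected_sd(12.1549, 2):.4f}")  # 12.1412
```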

The correction to be applied to a correlation coefficient for
grouping depends on the observation that when the distributions of the two
correlated variates tail off at the extremities the quantity
$\hat{r}_{xy}\,\hat{s}_x\,\hat{s}_y$ is independent of the class-interval used. It
immediately follows from this observation that

$$r_{xy} = \hat{r}_{xy}\,\frac{\hat{s}_x\,\hat{s}_y}{s_x\,s_y} \qquad (2)$$

where $\hat{r}_{xy}$ and $r_{xy}$ are the uncorrected and corrected values of the
correlation between x and y. Since, however,
$\hat{r}_{xy}\,\hat{s}_x\,\hat{s}_y = \Sigma xy / N$, the usual product-moment formula for a
correlation coefficient corrected for grouping may be written as follows:

$$r_{xy} = \frac{\dfrac{\Sigma xy}{N}}{\sqrt{\left(\hat{s}_x^2 - \dfrac{i_x^2}{12}\right)\left(\hat{s}_y^2 - \dfrac{i_y^2}{12}\right)}}, \qquad (3)$$

where $i_x$ and $i_y$ represent the class-intervals of x and y, respectively.

When correlation coefficients are calculated by the diagonal adding
method, the formula for a corrected coefficient becomes

$$r_{xy} = \frac{H + V - D}{2\sqrt{\left(H - \dfrac{N\,i_x^2}{12}\right)\left(V - \dfrac{N\,i_y^2}{12}\right)}}, \qquad (4)$$

where H, V, and D represent the sum of the squares of the
deviations from the mean values of x, y, and x - y, respectively.
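
The product-moment and diagonal-adding forms of the corrected coefficient can be compared on a small data set. This is a sketch only: the eight paired scores are invented, the function names are assumptions, and both formulas are written as this example reads them, with Sheppard's correction applied to each variance.

```python
from math import sqrt


def corrected_r_product_moment(xs, ys, i_x, i_y):
    """Sum of cross-products over N, divided by the corrected S.D.'s."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n - i_x**2 / 12
    var_y = sum((y - my) ** 2 for y in ys) / n - i_y**2 / 12
    return sxy / sqrt(var_x * var_y)


def corrected_r_diagonal(xs, ys, i_x, i_y):
    """Same coefficient from H, V, and D (sums of squared deviations
    of x, y, and x - y from their means)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    H = sum((x - mx) ** 2 for x in xs)
    V = sum((y - my) ** 2 for y in ys)
    D = sum(((x - mx) - (y - my)) ** 2 for x, y in zip(xs, ys))
    return (H + V - D) / (2 * sqrt((H - n * i_x**2 / 12) * (V - n * i_y**2 / 12)))


# Hypothetical class midpoints for eight paired scores, class-interval 10.
xs = [85, 95, 95, 105, 105, 115, 115, 125]
ys = [95, 85, 105, 95, 115, 105, 125, 115]
print(corrected_r_product_moment(xs, ys, 10, 10))
print(corrected_r_diagonal(xs, ys, 10, 10))
```

The two functions agree because D = H + V - 2Σxy, so that (H + V - D)/2 is the sum of cross-products.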

R. A. Fisher* has pointed out that in averaging correlation co- 
efficients the values of z should be obtained from uncorrected values 
of r and that a correction equivalent to the average correction of the 
averaged values of r should be added to the resulting coefficient. 

The corrected value of r is always larger than the uncorrected
value of r. The larger the value of r the larger the absolute value of
the correction to be made for grouping. The relative value of the
correction is constant, given constant values for the standard deviations
of the variates correlated. The size of the correction is independent
of N, the number of cases. Errors introduced by using uncorrected
values of r when r is large are much more significant than errors
resulting from a corresponding grouping when r is small. Not only is
the absolute discrepancy between the uncorrected and the corrected
value of r greater when r is large, but small differences between large
correlations represent a much greater difference in the degree of
relationship between the variates correlated than equivalent differences
between small coefficients.


*R. A. Fisher. Statistical methods for research workers, p. 211. London: 
Oliver and Boyd, 1938. 
EXPERIMENTAL. To estimate the accuracy with which values
corrected for grouping approximate to values obtained from
ungrouped data when the distribution is regarded as continuous, the
I.Q.’s of 952 children on two intelligence tests were plotted on a grid
with a class-interval of unity. The two distributions of scores were
approximately normal. The standard deviations of the two variables,
and the correlation between them were calculated. The class-interval
was then successively increased by telescoping, as it were, the original
grid, and further standard deviations and correlations were
calculated with class-intervals of 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18,
and 20.

Table 1 gives the uncorrected and corrected standard deviations
for variable x at different units of class interval, and the number of
arrays upon which each measure is based. The corrected standard
deviation with a class-interval of unity is taken as the standard, and the
deviations from this standard of the uncorrected and corrected
standard deviations, calculated at each step interval, are given in columns
$d_1$ and $d_2$, respectively. It will be observed that the uncorrected
standard deviation with a class-interval of unity is the same as would
have been obtained from ungrouped data. This value is, however,
corrected on the basis of the assumption that the distribution is
theoretically continuous.


TABLE 1
 class-    no. of       s_x            s_x
interval   arrays  (uncorrected)   (corrected)      d_1       d_2
    1        60       12.1550        12.1516       .0034     .0000
    2        30       12.1549        12.1412       .0033    -.0104
    3        20       12.1634        12.1325       .0118    -.0191
    4        15       12.1740        12.1191       .0224    -.0325
    5        12       12.1175        12.0313                -.1203
    6        10       12.1836        12.0599       .0320    -.0917
    7         9       12.3123        12.1452       .1607    -.0064
    8         8       12.4592        12.2433       .3076     .0917
    9                 12.6432        12.3734       .4916     .2218
   10         6       12.4806        12.1421       .3290    -.0095
   12         5       12.4620        11.9708                -.1808
   14         5       12.6512        11.9883       .4996    -.1633
   16         4       12.8747        12.0177       .7231    -.1339
   18         4       13.1897        12.1230      1.0381    -.0286
   20         3       13.3611        12.0493      1.2095    -.1023
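As a hedged check on the corrected column, the usual form of Sheppard's correction (subtracting one-twelfth of the squared class-interval from the grouped variance) can be applied to a few uncorrected values of Table 1. The function name `sheppard_sd` is an assumption of this sketch; the paper does not print the formula in this section.

```python
import math

def sheppard_sd(s_uncorrected, h):
    """Sheppard-corrected SD: subtract h**2 / 12 from the grouped variance."""
    return math.sqrt(s_uncorrected ** 2 - h ** 2 / 12)

# Uncorrected values of s_x from Table 1, keyed by class-interval.
for h, s in [(1, 12.1550), (2, 12.1549), (4, 12.1740), (8, 12.4592), (20, 13.3611)]:
    print(h, round(sheppard_sd(s, h), 4))
```

For these five rows the computed values agree with the corrected column of Table 1 to about four decimal places.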


Table 2 furnishes corresponding data for variable y. These data
indicate that the application of Sheppard's correction results in an
estimate of the standard deviation closely approximating to the value
that would have obtained from an ungrouped continuous variate.
Certain relatively large discrepancies in the corrected values
occasionally appear and are due to the purely arbitrary nature of the
points fixed as the top of the last class interval and the bottom of
the first.


TABLE 2
 class-    no. of       s_y            s_y
interval   arrays  (uncorrected)   (corrected)      d_1       d_2
    1        55       11.2309        11.2272       .0037     .0000
    2        28       11.2563        11.2416       .0291     .0144
    3        19       11.2518        11.2184       .0246    -.0088
    4        14       11.3123        11.2523       .0851     .0260
    5        11       11.3570        11.2645       .1298
    6        10       11.3988        11.2664       .1716     .0392
    7         8       11.3421        11.1595       .1149    -.0677
    8                 11.4128        11.1768       .1856    -.0504
    9         7       11.5848        11.2896       .3576     .0624
   10         6       11.5273        11.1600       .3001    -.0672
   12         5       11.8006        11.2807       .5634     .0535
   14         4       11.5885        10.8608       .3613    -.3664
   16         4       11.7920        10.8498       .5648    -.3774
   18         4       12.5132        11.3834      1.2860     .1562
   20         3       12.5510        11.1442      1.3238    -.0830


In the calculation of correlation coefficients by the diagonal adding
method, adding along one diagonal of the correlation grid yields
the distribution of the sum of the variates, while adding along the
other diagonal yields the distribution of the difference between the
variates. Thus if we wish to calculate the standard deviation of
variation in I.Q. between test and retest, instead of finding the actual
difference in I.Q. for every child and making a distribution of these
differences, we may find the standard deviation of difference in I.Q.
directly from the appropriate diagonal distribution. A peculiarity
exists, however, in the grouping of the diagonal distribution which
makes the standard deviation of x - y calculated from it greater than
the standard deviation of x - y calculated from the distribution
obtained by subtracting the appropriate values of y from x and grouping
the differences with class-interval equal to that of x and y in the
original grid. We may readily correct for this peculiarity since the
variance of the diagonal distribution is greater than the variance of
a distribution of actual differences of the same class-interval by an
amount equal to Sheppard's correction, one-twelfth of the square of the
class-interval. Thus if the latter standard deviation is corrected
for grouping once, the former must be corrected twice. This
point is capable of further illustration by reference to the formula

$$s_{(x-y)}^2 = s_x^2 + s_y^2 - 2r_{xy}s_xs_y.$$

Since the term $r_{xy}s_xs_y$ is invariant, it is apparent that the term
$s_{(x-y)}^2$ must be corrected twice if $s_x^2$ and $s_y^2$ are each
corrected and the equation is to be satisfied. Furthermore, this simple
observation indicates why there is no correction in any portion of the
numerator of formula (4). The numerator of this formula really reads


N N 2N 
(H-55) + - (D-55) » 
so that the corrections cancel one another, leaving the numerator
invariant.
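The variance identities behind the diagonal adding method can be checked exactly on a small example. The paired scores below are hypothetical and serve only to illustrate that the two diagonal variances determine the product-moment term; the function names are assumptions of this sketch.

```python
from fractions import Fraction

def variance(values):
    """Population variance (divisor N) in exact rational arithmetic."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / n

def covariance(xs, ys):
    """Population covariance (divisor N); this equals r * s_x * s_y."""
    n = len(xs)
    mx, my = Fraction(sum(xs), n), Fraction(sum(ys), n)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

# Hypothetical paired scores, for illustration only.
x = [98, 105, 110, 93, 121, 87, 102, 115]
y = [101, 99, 114, 90, 118, 92, 100, 109]

var_sum = variance([a + b for a, b in zip(x, y)])    # one diagonal
var_diff = variance([a - b for a, b in zip(x, y)])   # other diagonal
cov_xy = covariance(x, y)

print(var_diff == variance(x) + variance(y) - 2 * cov_xy)
print(var_sum - var_diff == 4 * cov_xy)
```

Because the difference of the two diagonal variances equals four times the covariance, any constant subtracted from both diagonal variances leaves that difference unchanged, which is the cancellation described above.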
To illustrate the foregoing discussion, the standard deviation of
the diagonal distribution was calculated at different class-intervals,
and these values uncorrected, corrected once, and corrected twice are
given in Table 3. The standard deviation of the difference with
class-interval unity is taken as the standard value, and the deviations
d_1, d_2, d_3 of the standard deviations at different class-intervals,
uncorrected, corrected once, and corrected twice, from this standard value
are given. It is apparent from an examination of the data in this
table that twice Sheppard's correction is the correction required.


TABLE 3
 class-    no. of     s_(x-y)        s_(x-y)      s_(x-y)
interval   arrays  (uncorrected)   (corrected   (corrected      d_1       d_2       d_3
                                     once)        twice)
    1        35        5.9401        5.9331       5.9261       .0140     .0070     .0000
    2        18        5.9662        5.9382       5.9101       .0401     .0121    -.0160
    3        12        6.0219        5.9592       5.8960       .0958     .0331    -.0301
    4        10        6.1348        6.0251       5.9134       .2087     .0990    -.0157
    5        10        6.2517        6.0828       5.9091       .3256     .1567    -.0170
    6                  6.2382        5.9929       5.7371                 .0668    -.1890
    7         7        6.5170        6.1958       5.8570       .5909     .2697    -.0691
    8         6        6.6952        6.2843       5.8428       .7691     .3582    -.0833
    9         5        7.0182        6.5196       5.9794      1.0921     .5935     .0533
   10         6        7.2845        6.6836       6.0330      1.3584     .7575     .1069
   12         5        7.4767        6.6258       5.6481      1.5506     .6997    -.2780
   14         5        8.3318        7.2860       6.0624      2.4057    1.3599     .1363
   16         3        8.5042        7.1406       5.4456      2.5781    1.2145    -.4805
   18         3        9.1499        7.5313       5.4516      3.2238    1.6052    -.4745
   20         3        9.9576        8.1130       5.6997      4.0315    2.1869    -.2264
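The single and double corrections can be illustrated on the first row of Table 3. This is a sketch assuming the usual form of Sheppard's correction (one-twelfth of the squared class-interval taken off the variance); the helper `corrected_sd` and its `times` parameter are hypothetical names.

```python
import math

def corrected_sd(s, h, times=1):
    """Apply Sheppard's correction (h**2 / 12 off the variance) `times` times."""
    return math.sqrt(s ** 2 - times * h ** 2 / 12)

s_diag = 5.9401  # uncorrected diagonal value at class-interval 1 (Table 3)
once = corrected_sd(s_diag, 1, times=1)
twice = corrected_sd(s_diag, 1, times=2)
print(round(once, 4), round(twice, 4))
```

Under this assumption the two results agree with the "corrected once" and "corrected twice" entries of the first row to within about one unit in the fourth decimal place.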


The correlations between the variates x and y were also calculated
at different units of class-interval. These values are given in
Table 4. Here again the corrected value with class-interval unity is
taken as the standard, and the deviations d_1 and d_2 of the obtained
and corrected values of r from this standard are calculated. A very
substantial decrease in the value of r with decrease in the number of
arrays into which the variates are grouped is observed. The discrepancy
between the uncorrected and corrected values of r is such as to
furnish sound support to the conclusion that correlation coefficients
must be corrected for grouping if accurate statistics are desired.
These data are indicative that Sheppard's correction furnishes a
remarkably accurate estimate of the correlation that would have
obtained from ungrouped data with continuous variates.


TABLE 4
 class-    no. of    no. of       r_xy           r_xy
interval   arrays    arrays   (uncorrected)   (corrected)     d_1       d_2
             x         y
    1        60        55        .8739          .8744        .0005     .0000
    2        30        28                       .8750                  .0006
    3        20        19        .8706          .8754        .0038
    4        15        14        .8661                       .0083     .0002
    5        12        11                                              -.0011
    6        10        10                       .8812        .0123
    8         8                  .8462          .8793        .0282     .0049
    9         7         7        .8357          .8762        .0387     .0018
   10         6         6        .8187          .8692        .0557    -.0052
   12         5         5                       .8836        .0630     .0092
   14         5         4        .7671          .8638        .1073    -.0106
   16         4         4        .7656          .8914        .1088     .0170
   18         4         4        .7478          .8943        .1266
   20         3         3        .7063          .8821        .1681
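The correction of r can be sketched from the remark that the product-moment term is invariant while each variance receives Sheppard's correction. The function below is an assumption of this note rather than a formula quoted from the paper; it takes the uncorrected r, the uncorrected standard deviations from Tables 1 and 2, and the class-interval h.

```python
import math

def corrected_r(r, s_x, s_y, h):
    """Correct r for grouping: the covariance r*s_x*s_y is left unchanged,
    while each variance is reduced by h**2 / 12."""
    cov = r * s_x * s_y
    return cov / math.sqrt((s_x ** 2 - h ** 2 / 12) * (s_y ** 2 - h ** 2 / 12))

# Uncorrected r from Table 4 with uncorrected s_x, s_y from Tables 1 and 2.
r3 = corrected_r(0.8706, 12.1634, 11.2518, 3)
r14 = corrected_r(0.7671, 12.6512, 11.5885, 14)
print(round(r3, 4), round(r14, 4))
```

For class-intervals 3 and 14 the results fall within a few units in the fourth decimal place of the corrected values printed in Table 4.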


In order to examine the functioning of Sheppard’s correction 
with a small value of r, a new grid was drawn up with 1828 cases. 
Values of r were found as before at successive class-intervals. Table 
5 gives values of r uncorrected and corrected for different
class-intervals. The deviations of the uncorrected and corrected values,
respectively, from a standard value .3672 are given in columns d_1
and d_2. The number of arrays is given, in this case the number of
arrays of the x variable being equal to the number of arrays of the
y variable for each value of r.

It will be observed that the d_1 column of Table 4 is in every case
greater than the d_1 column of Table 5, illustrating that the larger the
value of r the larger the absolute value of Sheppard's correction, and
emphasizing that correcting for grouping is of much more importance
when r is large than when r is small. Examination of the d_2
columns of Tables 4 and 5 shows that Sheppard's correction furnishes
a remarkably accurate estimate of the correlation that would have
obtained from ungrouped data with continuous variates. Furthermore,
if there is reason to believe that the distributions of the two
correlated variables have "high contact," some work can be avoided
by using a coarse grouping with a small number of arrays and
correcting for grouping. Tables 4 and 5 show that quite accurate results
can be obtained with as few as six arrays. With fewer than six arrays
the purely arbitrary position of the class-intervals will in most
cases lead to slight discrepancies in the corrected value of r.


TABLE 5
 class-    no. of       r_xy           r_xy
interval   arrays   (uncorrected)   (corrected)     d_1       d_2
    1        60        .3670                       .0002     .0000
    2                                 .3672        .0009     .0000
    3        20        .3648          .3668        .0024     .0004
    4        15        .3632          .3667        .0040     .0005
    5        12        .3613          .3667        .0059     .0005
    6        10                       .3660        .0091     .0012
    7         9        .3548          .3653        .0124     .0019
    8         8        .3520                       .0152     .0014
    9         7        .3514                       .0158    -.0013
   10         6        .3457          .3669                  .0003
   12         5        .3340
   14         5        .3452          .3773        .0220    -.0101
   16         4                                    .0538     .0056
   18         4        .3112          .3758        .0560    -.0086
   20         3        .2729          .3423        .0943     .0249

SUMMARY. If the distributions of variates used in statistical 
work taper off to zero at the extremities, the use of Sheppard’s cor- 
rection furnishes accurate estimates of the standard deviations and 
correlations that would have resulted from the use of ungrouped data. 
Correcting a correlation coefficient for grouping is essential when the 
grouping is coarse and the number of arrays is large. Otherwise in- 
accurate statistics will result. The discrepancies found in small cor- 
relations due to failure to correct for grouping are of less importance. 
Sheppard’s correction for grouping should be used whenever the 
grouping error cannot be considered small in relation to errors of 
random sampling. 
ON THE RECTILINEAR PREDICTION OF OBLIQUE FACTORS


HARRY H. HARMAN 
UNIVERSITY OF CHICAGO 


The general problem of estimating correlated or uncorrelated 
factors is treated. It is specifically indicated wherein the prediction 
of oblique factors differs from that of orthogonal factors. A short- 
ened method of estimation of correlated factors is developed. 


Factor analysis is concerned primarily with two basic problems: 
(1) the linear resolution of a set of variables in terms of hypothetical 
factors; and (2) the description of these factors in terms of the
observed variables. There are many and sundry methods for the
solution of the first of these problems.* A common feature of most of
these methods, however, is that for a set of n variables the final form
of solution reveals m (< n) common factors and n unique factors.
Since the total number of factors exceeds the number of variables, the 
value of any particular factor for a given individual can only be esti- 
mated from the observed values of the variables. The best prediction, 
in the least-square sense, is that obtained by the ordinary regression 
method. It is thus apparent that the solution of the second problem 
is in the nature of a linear regression equation. 

Let it be assumed that a set of n variables has been analyzed in
terms of m common factors and n unique factors, as follows:

$$z_j = a_{j1}F_1 + a_{j2}F_2 + \cdots + a_{jm}F_m + a_jU_j. \tag{1}$$

The observed variables and the factors are standardized over the
sample of N individuals, so that the standard value of the variable $z_j$ for
an individual $i$ may be written explicitly in the form:

$$z_{ji} = a_{j1}F_{1i} + a_{j2}F_{2i} + \cdots + a_{jm}F_{mi} + a_jU_{ji}. \tag{1'}$$


In general, the form of equation (1) is preferred, the secondary
subscript for the individual values being understood. A set of equations
of the type (1) is said to be a factor pattern. In this definition no
assumption is made as to the correlations among the factors. It is
usually postulated that the unique factors are uncorrelated among
themselves and with all common factors. The common factors, however,
may or may not be correlated. A factor pattern may be written
conveniently in the form of a matrix equation, namely,

$$Z = MF, \tag{2}$$

where the column vectors

$$Z = \{z_1\; z_2\; \cdots\; z_n\}, \qquad F = \{F_1\; \cdots\; F_m\; U_1\; \cdots\; U_n\},$$

represent the variables and factors, respectively, and the matrix M,
which consists of the matrix A of common factor coefficients and the
matrix U of unique factor coefficients, is defined by

$$M = \|A\;U\| = \begin{Vmatrix}
a_{11} & \cdots & a_{1m} & a_1 & 0 & \cdots & 0 \\
a_{21} & \cdots & a_{2m} & 0 & a_2 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
a_{n1} & \cdots & a_{nm} & 0 & 0 & \cdots & a_n
\end{Vmatrix}.$$

When there can be no confusion, the pattern matrix M of factor
coefficients may be referred to as the factor pattern.


* For a description of the methods leading to the "preferred types" of factor
solutions see (1, Part II).


The rectilinear prediction of any factor $F_s$ involves the
determination of the coefficients in the regression equation

$$\bar F_s = \beta_{s1}z_1 + \beta_{s2}z_2 + \cdots + \beta_{sn}z_n, \tag{3}$$

where the (n-1) secondary subscripts have been omitted from the $\beta$'s
for convenience. A similar equation can be written for any one of the
unique factors. It will be convenient to have the symbol $F_s$ represent
any one of the complete set of factors $F_1, F_2, \cdots, F_m, U_1, U_2, \cdots, U_n$.
In general, then,

$$\bar F = BZ, \tag{4}$$

where $\bar F$ represents the matrix of factor estimates and B the matrix
of regression coefficients.

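By the ordinary method for obtaining the regression of one variable
on n others, any factor $F_s$ can be estimated when the correlations
of the variables with this factor and among themselves are known.
Setting

$$D = \begin{vmatrix}
1 & r_{F_sz_1} & r_{F_sz_2} & \cdots & r_{F_sz_n} \\
r_{z_1F_s} & 1 & r_{12} & \cdots & r_{1n} \\
\vdots & \vdots & \vdots & & \vdots \\
r_{z_nF_s} & r_{n1} & r_{n2} & \cdots & 1
\end{vmatrix},$$

the regression coefficient of $z_j$ in the estimation of $F_s$ is given by

$$\beta_{sj} = -\frac{D_{sj}}{D_{11}}, \tag{5}$$

where $D_{11}$ is the minor of the first element and $D_{sj}$ is the cofactor of
$r_{F_sz_j}$ in D. It may be noted that D is merely the matrix R of the
observed correlations bordered by the correlations of the variables with
$F_s$. Then $D_{11} = R$ and the determinants $D_{sj}$ can be expressed in terms
of the cofactors of the original correlation matrix. Thus the values
(5) may be written explicitly in the form:

$$\beta_{sj} = \frac{1}{R}\left(r_{z_1F_s}R_{1j} + r_{z_2F_s}R_{2j} + \cdots + r_{z_nF_s}R_{nj}\right), \tag{6}$$

where $R_{ij}$ is the cofactor of $r_{ij}$ in the correlation determinant R.
The regression equation (3) for $F_s$ may then be written:

$$\bar F_s = \frac{1}{R}\sum_{j=1}^{n}\left(t_{1s}R_{1j} + t_{2s}R_{2j} + \cdots + t_{ns}R_{nj}\right)z_j, \tag{7}$$

where $t_{js} = r_{z_jF_s}$. In particular, if $F_s$ is one of the unique factors,
say $U_j$, this equation becomes:

$$\bar U_j = \frac{t_j}{R}\sum_{k=1}^{n}R_{jk}z_k, \tag{8}$$

or, in expanded form,

$$\bar U_j = \frac{t_j}{R}\left(R_{j1}z_1 + R_{j2}z_2 + \cdots + R_{jn}z_n\right), \tag{8'}$$

where $t_j = r_{z_jU_j}$. The matrix equation (4) for the prediction of the
entire set of factors finally becomes: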


$$\bar F = T'R^{-1}Z, \tag{9}$$

where $T'$ is the transpose of the factor structure,

$$T = \begin{Vmatrix}
t_{11} & t_{12} & \cdots & t_{1m} & t_1 & 0 & \cdots & 0 \\
t_{21} & t_{22} & \cdots & t_{2m} & 0 & t_2 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
t_{n1} & t_{n2} & \cdots & t_{nm} & 0 & 0 & \cdots & t_n
\end{Vmatrix}.$$

All the factors, whether they be correlated or not, can be estimated
by means of equation (9). When the factors are mutually orthogonal,
however, it can be shown (2, p. 324) that the elements
of M are equal to the corresponding elements of T, i.e., the pattern
and structure coincide. Then equation (9) may be written as follows:

$$\bar F = M'R^{-1}Z \quad \text{(uncorrelated factors)}. \tag{10}$$


On the other hand, if the factors are correlated equation (10) does not 
apply, and the distinction between pattern and structure must be 
clearly understood (1, Section 2.4 and Chapter XI). 

The use of equation (10) has been the conventional method for 
estimating orthogonal factors. Spearman’s explicit formula (3, Ap- 
pendix) for a single general factor is a special case of this equation, 
and the methods employed by Thurstone (4) and Thomson (5) for 
the case of several common factors are adaptations of this formula. 
Thomson also gave the explicit formula (8’) for the estimation of 
unique factors. 

It will be noted that in the application of formula (10) the evaluation
of $R^{-1}$ is involved. This may be very laborious if the number of
variables is large. For this reason, the author (6) has developed several
approximations to this formula, which involve the grouping of
certain variables to reduce the order of the matrix whose inverse is
required. More recently, Ledermann (7) and Guttman (8) have developed
a shortened method for the estimation of orthogonal factors
which involves the inverse of an m × m matrix instead of the inverse
of an n × n matrix and which is just as accurate as the longer method.
The effect of this method is to replace the observed correlations by
those reproduced (or computed) from the factor pattern.

Before the shortened method is generalized to the case of oblique
factors, a formula will be derived for the prediction of correlated
factors which explicitly employs the pattern matrix M. Let the matrix
of correlations among the common factors be defined by:

$$\phi = \begin{Vmatrix}
1 & r_{F_1F_2} & \cdots & r_{F_1F_m} \\
r_{F_2F_1} & 1 & \cdots & r_{F_2F_m} \\
\vdots & \vdots & & \vdots \\
r_{F_mF_1} & r_{F_mF_2} & \cdots & 1
\end{Vmatrix},$$

so that the matrix of correlations among all the factors is

$$\Phi = \begin{Vmatrix} \phi & 0 \\ 0 & I \end{Vmatrix}.$$

By a fundamental theorem of factor analysis (1, 2, 9), the matrix of
correlations of the variables can be expressed in the form:

$$R = M\Phi M'. \tag{11}$$

Then, noting that

$$M\Phi M'R^{-1} = RR^{-1} = I, \tag{12}$$


and premultiplying the members of (2) by the members of (12) there
results:

$$M\Phi M'R^{-1}Z = MF.$$

The premultiplier M, although not square, may be dropped from both
sides, and, again using a bar to denote estimated values, the equation
finally becomes:

$$\bar F = \Phi M'R^{-1}Z. \tag{13}$$


This equation reduces to (10) when all the factors are uncorrelated.
Since the unique factors are of minor interest in factor analysis,
it is convenient to write equation (13) for the common factors only,
namely,

$$\bar f = \phi A'R^{-1}Z, \tag{14}$$

where the small $f$ is used to denote the column vector $\{F_1\; F_2\; \cdots\; F_m\}$.
In case the common factors are uncorrelated, this equation reduces to:

$$\bar f = A'R^{-1}Z \quad \text{(uncorrelated factors)}. \tag{15}$$


In formulas (13) and (14) the pattern values are employed instead 
of the structure values. 

Now the method of Ledermann and Guttman will be generalized
to the case of correlated common factors. From (11),

$$R = \|A\;U\| \cdot \begin{Vmatrix} \phi & 0 \\ 0 & I \end{Vmatrix} \cdot \begin{Vmatrix} A' \\ U \end{Vmatrix} = A\phi A' + U^2. \tag{16}$$

Then, premultiplying R by $A'U^{-2}$ the following expression is obtained,
which ultimately leads to the desired result:

$$A'U^{-2}R = A'U^{-2}(A\phi A' + U^2) = (A'U^{-2}A\phi + I)A',$$

$$A'U^{-2}R = (I + K)A', \tag{17}$$

where $K = A'U^{-2}A\phi$. Premultiplying both members of (17) by
$(I + K)^{-1}$ and postmultiplying by $R^{-1}$, this equation becomes

$$(I + K)^{-1}A'U^{-2} = A'R^{-1}. \tag{18}$$

Then, substituting (18) into (14), there finally results:

$$\bar f = \phi(I + K)^{-1}A'U^{-2}Z. \tag{19}$$
This equation indicates that, even when the factors are correlated,
only the inverse of an m × m matrix is required.

To show the general character of formula (19), it is convenient
to write it in another form. Premultiplying both sides of (19) by
$[\phi(I + K)^{-1}]^{-1}$, we get:

$$(I + K)\phi^{-1}\bar f = A'U^{-2}Z. \tag{20}$$

By noting the order of each constituent matrix in (20), it will be
observed that each of the two matrices resulting from the multiplications
contains m rows and one column. This matrix equation thus
represents a system of m algebraic equations, obtained by setting the
corresponding elements equal to each other. The elements of the
matrices in the right-hand member of (20) are obtained simply enough,
but the expression on the left appears to be rather complex. This can
be simplified, however. Substituting the definition of K in the
premultiplier of $\bar f$, this expression becomes:

$$(I + K)\phi^{-1} = (I + A'U^{-2}A\phi)\phi^{-1} = (\phi^{-1} + A'U^{-2}A). \tag{21}$$

The system of m equations for the estimation of the common factors
may finally be written in the form:

$$(\phi^{-1} + A'U^{-2}A)\bar f = A'U^{-2}Z. \tag{22}$$


From this equation it is evident that, first, the inverse of the m × m
matrix of intercorrelations of factors is calculated, to this is added
the m × m matrix $(A'U^{-2}A)$, and then the inverse of this composite
m × m matrix is required in order to obtain the explicit equations
for the estimation of oblique factors. The expression (22) includes
Ledermann's formula (7, eq. 11) as a special case, namely, when $\phi$ is
the identity matrix. There is also some similarity between this
formula and the one derived by Bartlett (10, eq. 4). Of course, since
Bartlett has assumed a different principle of estimation than that
employed in this paper, his formula is not a special instance of (22).
The analogy exists in the fact that if the term $\phi^{-1}$ is dropped in
equation (22), it becomes identical with Bartlett's formula.
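The agreement of the long and shortened methods can be checked numerically on a toy case. The sketch below is not from the paper: the pattern A (four variables, two factors), the factor correlation of 1/2, the score vector Z and all helper names are hypothetical. Exact rational arithmetic is used so that equality can be tested without rounding error.

```python
from fractions import Fraction as Fr

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(k):
    return [[Fr(int(i == j)) for j in range(k)] for i in range(k)]

def diag(values):
    k = len(values)
    return [[values[i] if i == j else Fr(0) for j in range(k)] for i in range(k)]

def inverse(X):
    """Gauss-Jordan inverse of a small non-singular square matrix."""
    k = len(X)
    aug = [list(row) + unit for row, unit in zip(X, identity(k))]
    for c in range(k):
        p = next(r for r in range(c, k) if aug[r][c] != 0)   # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(k):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]

# Hypothetical pattern A (n = 4, m = 2) and factor correlations phi.
A = [[Fr(1, 2), Fr(0)], [Fr(1, 2), Fr(1, 4)], [Fr(0), Fr(1, 2)], [Fr(1, 4), Fr(1, 2)]]
phi = [[Fr(1), Fr(1, 2)], [Fr(1, 2), Fr(1)]]
At = transpose(A)

common = matmul(matmul(A, phi), At)                   # A phi A'
U2 = diag([1 - common[j][j] for j in range(4)])       # U^2 chosen so that diag(R) = 1
R = add(common, U2)                                   # equation (16)
U2_inv = diag([1 / U2[j][j] for j in range(4)])

long_form = matmul(matmul(phi, At), inverse(R))       # phi A' R^-1, equation (14)
K = matmul(matmul(matmul(At, U2_inv), A), phi)        # K = A' U^-2 A phi
short_form = matmul(matmul(matmul(phi, inverse(add(identity(2), K))), At), U2_inv)  # (19)

# Equation (22) solved for one hypothetical vector of standard scores Z.
Z = [[Fr(1)], [Fr(-1, 2)], [Fr(3, 2)], [Fr(0)]]
lhs = add(inverse(phi), matmul(matmul(At, U2_inv), A))
f_22 = matmul(inverse(lhs), matmul(matmul(At, U2_inv), Z))
f_14 = matmul(long_form, Z)

print(long_form == short_form, f_14 == f_22)
```

Only 2 × 2 inverses (and a diagonal one) are needed on the shortened route, whereas the long route inverts the 4 × 4 matrix R; with more variables the saving grows accordingly.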
GENETIC EMERGENCE OF FACTOR SPECIFICITY


T. W. RICHARDS 


SAMUEL S. FELS RESEARCH INSTITUTE 
YELLOW SPRINGS, OHIO 


Mental test data of Chrysostom and of Garrett, Bryan, and
Perl are reinvestigated in order to determine the shifts in test clus- 
ters with chronological age. It is found that the factors under sur- 
vey tend to become more independent with increasing age. 


One theory of the genesis of mental organization is that mental 
abilities become more specific as children grow older. According to 
this notion, it might be supposed that fairly discrete or uncorrelated 
factors at a given age level might be expected to overlap at an earlier 
age level. Two studies of test intercorrelations at succeeding age 
levels were used to determine if possible the progressive shifts in test 
clusters with chronological age. Garrett, Bryan, and Perl (2) used 
ten tests on nine-, twelve- and fifteen-year-old-children, both boys and 
girls. Six matrices of intercorrelations were thus obtained. The tests 
used are described in some detail in the report (2). At least five were 
“memory” tests, and four were “non-memory.” Chrysostom (1) used 
nine tests with Belgian school boys in the fourth, fifth, and sixth 
grades. His material has not, as far as I know, been published except 
as an abstract (1). Chrysostom’s tests consisted of French adapta- 
tions of five Stanford Achievement Tests (paragraph meaning, sen- 
tence meaning, word meaning, arithmetic problems, arithmetic com- 
putation), a substitution test taken from Woodworth and Wells’ 
Association Tests and from the Army Mental Tests, following directions
from Burt's Mental and Scholastic Tests, and an information
test, taken from Haggerty’s Intelligence Examination Delta, for 
grades 3 to 9. These were described as above in personal communica- 
tion (Dec., 1939). 

The numbers of cases used in the various groups were as follows:


          Garrett, et al                      Chrysostom (boys)
                    Boys   Girls
nine years           108     117             4th grade     56
twelve years          96     100             5th grade     57
fifteen years        102     123             6th grade     54


Rotation of axes for these nine matrices involved extraction of
factors for Chrysostom's material, and recalculating much of the
Garrett material. In the case of each matrix, two factors seemed
significant. However, for Chrysostom's fifth-grade group and
Garrett's fifteen-year-old boys it was necessary to use the third extracted
factor rotated with the first, rather than K1 and K2 as in every other
case.

It appeared to me that for the oldest age groups in each case, 
Chrysostom’s sixth-grade and Garrett’s two fifteen-year-old groups, 
two factors were established by rather clear clusters. Although nam- 
ing these factors seemed secondary to tracing the formation of clus- 
ters genetically, I might say that Chrysostom’s factors seemed to be 
a “verbal” factor, heaviest in tests 1, 2, 3, and 8 (see Table 1), and 
a “directions,” or ability to follow directions or solve problems, found 
in tests 4, 5, 6, and 7. The factors in Garrett’s material seem to me 
to be a memory factor found in tests 6, 7, 8, 9, and 10, and a (non- 
memory) factor, perhaps speed, found in tests 1, 2, 3, 4, and 5. 

Factor loadings appear in Tables 1 and 2. 

Figures 1, 2, and 3 illustrate the shift in clusters with chronolog- 


TABLE 1 
Unrotated and Rotated Values from Chrysostom (1) 
Unrotated Loadings 


Grade 
4 5 6 

Test K, K, K, K, K, K, K, 

1 Reading Paragraph -785 —.226 688 —.103 —.462 .675 .293 

2 Reading Sentences 697 —.226 561 —.120 —.149 

8 Word Meaning 735 —.414 -732 —.255 —.118 .671 508 

4 Arithmetic Problems .695 .258 .871 .267 102 -722 —.856 

5 Arithmetic Computation .659 -718 357 612 —.504 

6 Substitution 481 —.185 .389 —.301 

7 Directions .555 684 —.291 311 576 —.265 

8 Opposites 578 —.333 .285 —.242 673 826 

Rotated Loadings 
Grade 
4 5 6 

Test I Il h2 7 II K, ih I Wh 

1 Reading Paragraph .74 .85 .67 .78 18 —10 .63 .70 24 

2 Reading Sentences .67 .29 .54 .50 .29—12 .85 .68 .14 ~~ «48 

8 Word Meaning 68 17 ..71..60 44—26 62 284 08 7 

4 Arithmetic Problems .65 .55 .54 .69 .27 .84 .75 .65 
5 Arithmetic 

6 Substitution 16 69 89 002 568 —.14 86 .49 .26 

7 Directions 138 49 .26 .70—29 .65 25 ~ 
TABLE 2
Rotated Values from Garrett, Bryan, and Perl(2) 
Nine Years 
Boys 

I II h2 I 

39 18 31 
30 53 37 49 
-73 30 62 62 
26 —.03 12 63 
37 62 52 40 
—.04 62 39 
37 56 44 53 
84 15 14 37 
42 19 21 07 
56 61 67 15 


Objects 
Vocabulary
Objects


Making gates 
Vocabulary 
Arithmetic 

Paper Form Board 
Logical Prose 
Word-Word 

Word Retention 
Digit Span 


Geometric Forms —.04 .06 


Objects 
ical age. In the case of the Chrysostom material, it is seen that the
sixth-grade clusters (white) are most discrete, with the fourth-grade 
clusters (dotted shading) less discrete but more so than the fifth- 
grade material. Possibly the fifth-grade material represented a sam- 
pling of some sort less comparable with the other two grades than 


= 
: 
Arls 
II he 
45 .29 
30 ® 
31 AT 
09 -40 : 
65 57 
51 80 
1 31 
9 37 
I 7 22 
9 36 
Twelve Years 
Boys Girls ; 
I II h2 I II h2 
.40 .26 .23 82 .20 14 
68 25 55 82 —.06 67 
67 43 .64 .59 32 44 
39 57 46 57 22 
-76 .29 .66 61 14 .40 
13 46 28 35 .20 16 
.06 57 82 —.09 .60 36 
48 .04 2 .06 49 .25 : 
AT 15 16 10 -04 
.69 AT 23 -50 30 
Fifteen Years : 
Boys Girls 
I Il h? Ky Il h2 
AT 10 29 —.26 AT .08 .23 
52 OT 54 38 -70 07 49 
46 16 .00 52 
39 —.05 18 —.14 46 04 22 
28 16 19 .54 10 30 
.20 18 22 16 55 82 
.05 42 35 —.42 .00 57 82 
17 45 .26 —.20 16 19 .06 
| 18 18 24 09 
these were with each other. The charts for the Garrett et al material
show that for the fifteen-year-old girls there is a shift in discreteness
from the ninth through the twelfth to the fifteenth year. The case
for the boys is less clear, perhaps, but similar. Here there is
considerable overlapping at the nine-year level between the clusters; at
twelve years there is no overlapping but the memory cluster gets over
into the non-memory area definitely, because of test 8. At fifteen
years there is no overlapping, but total variance is low.


FIGURE 1
Factors I & II from Chrysostom material, for fourth-, fifth-, and sixth-grade boys.

The two most confusing levels, Garrett's fifteen-year-old boys
where total variance was low, and Chrysostom's fifth grade where
there was a reversion in the age tendency from purity to impurity,
were those cases in which it was necessary to extract a third factor
to get the homologue to the second factor in each other case.
Since, as far as I know, these studies are the only ones showing
a tendency for factors to become more independent or orthogonal
with increasing age, it seemed that a brief report of them might be
of interest.


FIGURE 2
Factors I & II from Garrett, Bryan, and Perl, for nine-, twelve-, and
fifteen-year-old girls. (Axes: Speed, Memory.)


FIGURE 3
Factors I & II from Garrett, Bryan, and Perl, for nine-, twelve-, and
fifteen-year-old boys.
NOTE ON THE MATHEMATICAL THEORY OF INTERACTION
OF SOCIAL CLASSES 


N. RASHEVSKY 
THE UNIVERSITY OF CHICAGO 


In continuation of previous studies, the interaction of two
social classes is investigated from a somewhat different point of view. 
An exchange of results of activities of the two classes is considered, 
and a mathematical approach is outlined based on the use of such 
psychological concepts as the satisfaction function. 


In previous papers, (2, 3, 4) we have studied mathematically dif- 
ferent types of interaction of social classes. In one of the papers (2) 
we discussed interactions based mainly on psychological principles, 
while in others (2, 3) we considered an interaction that may perhaps 
best be called “economic,” though it also involved sociological and psy- 
chological mechanisms. In the present paper we shall discuss a possible 
mathematical approach from a still different point, which, while psy- 
chological in its essence, may be applied as well to interactions that 
resemble more the “economic” type. This approach consists in a gen- 
eralization to social classes of some concepts that have been studied 
for the case of single individuals. 

Consider again two coexisting classes I and II. Let both classes, 
amongst other activities, perform two given activities A and B, pro- 
ducing correspondingly per unit time some results a and b. Those 
results may be considered as “commodities,” either of a material na- 
ture, like food, or of a more abstract nature, like knowledge of some 
sort, that may be communicated to others. Let the amounts of the
results of activities produced by the first class be $a_1$ and $b_1$, by the
second $a_2$ and $b_2$. It may happen that $a_1$ is rather large while $a_2$ is
small, and at the same time $b_1$ small while $b_2$ is large. In that case an
exchange of "commodities" will take place, the first class receiving
some b from the second and giving in return some a.

To determine the character of that exchange, we shall use, as 
has been done by other authors (1, 5) the concept of satisfaction, 
which we may apply to a class as a whole, if it consists of approxi- 
mately similar individuals. That satisfaction, as a psychological quan- 
tity, may be actually, though indirectly, measured and discussed quan- 
titatively has been shown by L. L. Thurstone (5). Thurstone comes 
to the conclusion, derived from psychophysical experimental evidence,
that the satisfaction varies logarithmically with the amount of com- 
modity possessed, and he assumes moreover that for several commod- 
ities the satisfactions are simply additive. However, he makes some 
reservations as to the generality of the logarithmic relation. For the 
present general discussion, we shall not make any special assump- 
tions about the shape of the satisfaction function. Moreover, we shall 
consider the satisfaction not in terms of the quantities of commod- 
ities possessed, but in terms of the rates of productions of those com- 
modities. This is psychologically legitimate, for one may derive a 
greater satisfaction from producing or receiving per unit time more 
commodity. We shall speak therefore of “production” of a commodity 
by a class, if that commodity is received from outside, for instance 
from the other class. 

We assume that for any individual of class I there is a satisfaction
function $s_1(x, y)$, where x and y are the amounts of the two
commodities in question received per unit time. Similarly for class II we
have $s_2(x, y)$. If $N_1$ and $N_2$ are the numbers of individuals in class
I and class II, respectively, we shall define $S_1(x, y) = N_1s_1(x, y)$ and
$S_2(x, y) = N_2s_2(x, y)$ as the satisfaction functions for classes I and
II, respectively. We shall put

$$\frac{\partial S_1}{\partial x} = X_1(x, y); \quad \frac{\partial S_1}{\partial y} = Y_1(x, y); \quad \frac{\partial S_2}{\partial x} = X_2(x, y); \quad \frac{\partial S_2}{\partial y} = Y_2(x, y). \tag{1}$$
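As a purely illustrative case of these definitions, one may take an additive logarithmic satisfaction function of the kind the text attributes to Thurstone. The paper itself makes no such assumption; the class size `N1`, the function names and the evaluation point below are all hypothetical.

```python
import math

N1 = 100  # hypothetical number of individuals in class I

def s1(x, y):
    """Assumed individual satisfaction: additive and logarithmic in each rate."""
    return math.log(x) + math.log(y)

def S1(x, y):
    """Class satisfaction S_1 = N_1 * s_1."""
    return N1 * s1(x, y)

def partial(f, x, y, wrt, eps=1e-6):
    """Central-difference estimate of a first partial derivative."""
    if wrt == "x":
        return (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    return (f(x, y + eps) - f(x, y - eps)) / (2 * eps)

X1 = partial(S1, 2.0, 4.0, "x")   # marginal satisfaction with respect to x
Y1 = partial(S1, 2.0, 4.0, "y")   # marginal satisfaction with respect to y
print(round(X1, 3), round(Y1, 3))
```

Under this assumed form the marginal satisfactions are N1/x and N1/y, so they fall as the rate of receipt of each commodity rises.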


If each individual in a class, and therefore the class as a whole,
agrees to such an exchange, for which his satisfaction has the largest
possible value, and if this exchange goes on in such a way that for a
unit of x always the same number of units of y are given, then we
may calculate the results of such an exchange by using formulae
developed by G. Evans (1, pp. 125-128). Denoting by $x_1$ and $y_1$ the
rates of production of x and y in class I when the exchange is
operating, and by $x_2$ and $y_2$ corresponding quantities for class II, we have for
the determination of $x_1$, $y_1$, $x_2$, and $y_2$ the following equations


Xi (41, 
X,(@, + a, — + — 


(21, Y:) ‘ (2) 


¥2(a, + a, — + — 
$$x_1 + x_2 = a_1 + a_2; \qquad y_1 + y_2 = b_1 + b_2. \tag{3}$$

Equations (2) and (3) express $x_1$, $y_1$, $x_2$, and $y_2$ in terms
of $a_1$, $b_1$, $a_2$, and $b_2$:

$$x_1 = u_1(a_1, b_1, a_2, b_2); \quad x_2 = u_2(a_1, b_1, a_2, b_2); \quad y_1 = v_1(a_1, b_1, a_2, b_2); \quad y_2 = v_2(a_1, b_1, a_2, b_2). \tag{4}$$


Depending on the choice of the functions $S_1$ and $S_2$, a different
distribution of rates of production will be obtained. It is possible
that while $a_1 + b_1 > a_2 + b_2$, yet $x_1 + y_1 < x_2 + y_2$.

The quantities $a_1$, $b_1$, $a_2$, and $b_2$ refer to the class as a
whole. The corresponding quantities per individual shall be denoted by
$a'_1$, $a'_2$, $b'_1$, and $b'_2$. Thus $a_1 = N_1 a'_1$, etc.

Consider the theoretical case in which at the time $t = 0$ class I
consists of $n_1$ individuals capable of activity A only, while class II
is composed of $n_2$ individuals capable of activity B only. We shall
refer to them as individuals of type A and type B. An individual of
type A is characterized by

$$a' > 0, \quad b' = 0. \tag{5}$$

An individual of type B is characterized by

$$a' = 0, \quad b' > 0. \tag{6}$$

We thus have at $t = 0$

$$a_1 > 0; \quad b_1 = 0; \quad a_2 = 0; \quad b_2 > 0. \tag{7}$$


Now consider a variation of the composition of each class due to
a mechanism discussed in a previous paper (2), and based on a
dissimilarity of offspring and parents. Denoting by $n^A_1$ the number
of individuals in class I who are performing activity A, by $n^B_1$ the
number of individuals in class I performing activity B, and by $n^A_2$
and $n^B_2$ the corresponding quantities for class II, we find for the
variations of these quantities with respect to time equations of the
form of equations (43) and (46) of a previous paper (2). We thus have

$$n^A_1 = n^A_1(t); \quad n^B_1 = n^B_1(t); \quad n^A_2 = n^A_2(t); \quad n^B_2 = n^B_2(t). \tag{8}$$

When class I is composed of individuals performing both activities
A and B, then the value of $a_1$ will be less than for the case of a
class consisting of individuals performing only activity A. On the
other hand, $b_1$ will now not be zero. We have now

$$a_1 = n^A_1 a'; \quad b_1 = n^B_1 b'. \tag{9}$$

Similarly, for the second class

$$a_2 = n^A_2 a'; \quad b_2 = n^B_2 b'. \tag{10}$$

Since the $n$'s are functions of time, $a_1$, $b_1$, $a_2$, and $b_2$
will also be known functions of time. Individuals of the two different
types will have different satisfaction functions, $s_A(x, y)$ and
$s_B(x, y)$. For the class as a whole we shall have

$$S_1 = n^A_1 s_A + n^B_1 s_B; \quad S_2 = n^A_2 s_A + n^B_2 s_B. \tag{11}$$


If $s_A$ and $s_B$ are given, then, because of equation (8), $S_1$ and
$S_2$ are known functions of time, $S_1(t)$, $S_2(t)$. We may now obtain
$x_1$, $x_2$, $y_1$, and $y_2$ as functions of time, by using equations
(4), in which $u_1$, $u_2$, $v_1$, and $v_2$ are derived from the
corresponding $S_1(t)$ and $S_2(t)$, and in which the values $a_1$,
$a_2$, $b_1$, and $b_2$, given by equations (9) and (10), are used. In
this way we obtain $x_1 + y_1$ and $x_2 + y_2$ as functions of time. We
thus may study the variation of the characteristics of the two classes
with respect to time.

In a previous paper (2), we have derived a condition for one
class to control the activities of the other. Denoting by $N_1$ the
total population of class I, and by $N_2$ the total population of
class II, that is, in our case

$$N_1 = n^A_1 + n^B_1; \quad N_2 = n^A_2 + n^B_2, \tag{12}$$

we found that under certain assumptions class I controls class II, if


N,> N, 13) 
ede (N, 2) ( 
Condition (13) is obtained from condition (8) of a previous paper (2) 
by changing the notations, namely by putting there =0, 
a= 9.2, % —g,. The constants g, and g. depend, as we have seen 
before (loc. cit.), among other things on technical facilities possessed 
by the different classes. In general, therefore, $g_1$ will be a
function of $(x_1 + y_1)$, while $g_2$ will be a function of
$(x_2 + y_2)$. Hence

$$g_1 = g_1(t); \quad g_2 = g_2(t). \tag{14}$$
In the inequality (13) all quantities involved are thus, for a
prescribed shape of $s_A(x, y)$ and $s_B(x, y)$, as well as of
$g_1(x_1 + y_1)$ and $g_2(x_2 + y_2)$, known functions of time. We may
investigate, then, for what times the relation (13) holds or ceases to
hold, by substituting an equality sign for the inequality and by solving
with respect to $t$. Thus it may happen that inequality (13) holds for
$t = 0$ but ceases to hold for $t > t_1 > 0$. This would mean that
class I will control class II from $t = 0$ to $t = t_1$, but for
$t > t_1$ the situation will change in accordance with equations
developed before (2). Or inequality (13) may not hold for $t < t_1$ but
hold for $t > t_1$. In that case class I becomes the controlling class
at $t = t_1$. Or else the equation obtained from (13) may have several
roots, $t_1$, $t_2$, etc., in which case we shall have a fluctuation of
the control between the two classes. The explicit form of the equation
for $t_1$ depends on the two mathematical assumptions: the shape of
$s_A(x, y)$ and $s_B(x, y)$, and the shape of $g_1$ and $g_2$ as
functions of $(x + y)$. The latter assumption is more of a physical
nature. It depends largely on the nature of the quantities $x$ and $y$:
whether they are, for instance, material commodities, money, or some
abstract knowledge. The simplest assumption is to make the $g$'s
proportional to $x + y$, but other assumptions are also possible. As to
the choice of the functions $s_A(x, y)$ and $s_B(x, y)$, this is
determined by purely psychophysical assumptions. We may have either the
form suggested by Thurstone or some other plausible forms. Keeping a
given functional expression for the $g$'s and making different
assumptions about the satisfaction function, we shall find different
expressions for the variation of the interaction of social classes with
respect to time. Thus we may study the effect of such a purely
psychological factor as the satisfaction function upon the dynamics of
society. For each choice of $s_A(x, y)$ or $s_B(x, y)$ and of the $g$'s
we have a definite mathematical problem, and the different problems thus
obtained should be studied separately.
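
The search for the times at which condition (13) begins or ceases to
hold can be sketched numerically. The sketch below assumes a
hypothetical function `margin(t)`, standing for the left side of (13)
minus its right side once particular satisfaction functions and $g$'s
have been chosen. The function name, the grid scan, and the bisection
refinement are illustrative assumptions and are not part of the
development above.

```python
import math
from typing import Callable, List


def control_switch_times(margin: Callable[[float], float],
                         t_max: float,
                         steps: int = 1000,
                         tol: float = 1e-9) -> List[float]:
    """Return the times in [0, t_max] at which margin(t) changes sign.

    margin(t) > 0 is taken to mean that inequality (13) holds at time t.
    Roots that fall exactly on a grid point are not detected by this scan.
    """
    roots = []
    dt = t_max / steps
    t_lo, f_lo = 0.0, margin(0.0)
    for k in range(1, steps + 1):
        t_hi = k * dt
        f_hi = margin(t_hi)
        if f_lo * f_hi < 0.0:           # sign change inside (t_lo, t_hi)
            a, b, fa = t_lo, t_hi, f_lo
            while b - a > tol:          # plain bisection
                m = 0.5 * (a + b)
                fm = margin(m)
                if fa * fm <= 0.0:
                    b = m
                else:
                    a, fa = m, fm
            roots.append(0.5 * (a + b))
        t_lo, f_lo = t_hi, f_hi
    return roots


# Hypothetical margin with several roots, i.e. a fluctuation of control.
example_roots = control_switch_times(math.cos, t_max=10.0)
```

Each returned root marks a time at which control would pass from one
class to the other under the chosen assumptions.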
MAXIMUM LIKELIHOOD ESTIMATION AND
FACTOR ANALYSIS 


GALE YOUNG 
OLIVET COLLEGE 


Fisher’s method of maximum likelihood is applied to the problem
of estimation in factor analysis, as initiated by Lawley, and
found to lead to a generalization of the Eckart matrix approximation
problem. The solution of this in a special case is applied to show
how test fallibility enters into factor determination, it being noted
that the method of communalities underestimates the number of
factors.


I 
Any observed score $s$ may be regarded as the sum of a “true”
score $t$ and an “error” score $x$ drawn from an error population with a
distribution function $p(x)$. A common assumption in test theory is
that the errors are normally distributed, i.e., that

$$p(x) = \frac{h}{\sqrt{\pi}}\, e^{-h^2 x^2}, \tag{1}$$

where $h = 1/(\sigma\sqrt{2})$, $\sigma$ being the standard deviation.

Similarly, in factor analysis an observed score matrix $S = (s_{ij})$
may be regarded as arising from a true score matrix $T = (t_{ij})$ of
lower rank $r$ through the operation of independent random errors
$x_{ij}$. There then arises the problem of estimating $T$ from $S$, and
Lawley (6) has recently suggested doing this by means of Fisher’s
method of maximum likelihood,* which is useful in the construction of
estimates with desirable properties (2, 3, 4). The method consists in
forming the probability density of occurrence of $S$ from an arbitrary
$T$, and then taking as an estimate that value of $T$ which maximizes
this density. With errors distributed as in (1), we are to maximize

$$L = \prod_{i,j} \frac{h_{ij}}{\sqrt{\pi}}\, e^{-h_{ij}^2 x_{ij}^2} \tag{2}$$

subject to $T = S - X$ being of rank $r$. This is equivalent to
maximizing $\log L$, and the problem thus reduces to that of minimizing

$$\sum_{i,j} h_{ij}^2 x_{ij}^2. \tag{3}$$

*Dr. George Brown of Princeton University has independently made the
same suggestion in some unpublished work.


If the dispersions $\sigma_{ij}$ of the error populations are equal,
this is the same as the earlier “matrix approximation” procedure of
Eckart (1); otherwise Eckart’s problem is here generalized by the
appearance of weighting factors $h_{ij}$ for the residuals.


II 


A solution of the general problem represented by (3) is not
known to the present writer, but if the weighting matrix $H = (h_{ij})$
is of rank one the problem may be readily reduced to the unweighted
Eckart problem, the solution of which is known.

If $H$ is of rank one its elements are of the form $h_{ij} = a_i b_j$.
If any element of $H$ were zero, so then would be all others in either
the same row or in the same column, and this row or column would drop
out in (2), leaving a smaller matrix to work with. Hence if we assume
$H$ to be of rank one it is no further loss of generality to suppose
its elements all non-zero.

Let the subscript 1 on a matrix denote the matrix obtained by
weighting it with $H$, i.e., by multiplying each element by the
corresponding element of $H$. Thus $X_1$ is the matrix whose elements
are squared and added in (3). For $H$ as specified above we have

$$X_1 = AXB, \qquad S_1 = ASB, \qquad T_1 = ATB, \tag{4}$$

where $A$ and $B$ are non-singular diagonal matrices with elements $a_i$
and $b_j$. Therefore the weighting transformation does not in this case
alter the rank of a matrix, and neither does the “unweighting” or
inverse transformation given by $X = A^{-1} X_1 B^{-1}$, etc. Then the
solution of the weighted approximation problem (3) is obtained by
weighting $S$ to get $S_1$, finding the best rank $r$ unweighted
approximation $T_1$ to $S_1$, and then unweighting $T_1$ to get $T$. For
if $T^*$ were a better rank $r$ weighted approximation to $S$ than is
the matrix $T$ thus obtained, then $T^*_1$ would be a better rank $r$
unweighted approximation to $S_1$ than is $T_1$, because the quantity to
be minimized is the same in either problem.

This proof does not hold for an arbitrary H , since weighting a 
matrix may then change its rank. 
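
The reduction for a rank-one weighting matrix can be sketched in a few
lines. The example below is a minimal illustration for the case
$r = 1$ only, and it uses power iteration as a stand-in for the known
Eckart solution; the function names and the numerical data are
hypothetical and chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def best_rank_one(M: Matrix, iters: int = 500) -> Matrix:
    """Unweighted least-squares rank-one approximation of M.

    Power iteration on M'M finds the leading right singular vector v;
    the approximation is then (M v) v'. The all-ones start is assumed
    not to be orthogonal to the leading singular vector.
    """
    rows, cols = len(M), len(M[0])
    v = [1.0] * cols
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(M[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return [[0.0] * cols for _ in range(rows)]
        v = [x / norm for x in w]
    u = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]


def weighted_rank_one(S: Matrix, a: List[float], b: List[float]) -> Matrix:
    """Rank-one T minimizing sum_ij (a_i b_j (s_ij - t_ij))^2, with all a_i, b_j non-zero."""
    n, m = len(a), len(b)
    S1 = [[a[i] * S[i][j] * b[j] for j in range(m)] for i in range(n)]    # weight: S1 = A S B
    T1 = best_rank_one(S1)                                                # unweighted problem
    return [[T1[i][j] / (a[i] * b[j]) for j in range(m)] for i in range(n)]  # unweight


def weighted_loss(S: Matrix, T: Matrix, a: List[float], b: List[float]) -> float:
    """The quantity (3) for weights h_ij = a_i b_j."""
    return sum((a[i] * b[j] * (S[i][j] - T[i][j])) ** 2
               for i in range(len(a)) for j in range(len(b)))


S_demo = [[4.0, 1.0, 2.0], [2.0, 3.0, 1.0], [1.0, 2.0, 5.0]]
a_demo = [1.0, 3.0, 0.5]
b_demo = [2.0, 1.0, 0.25]
T_demo = weighted_rank_one(S_demo, a_demo, b_demo)
```

Comparing `weighted_loss(S_demo, T_demo, a_demo, b_demo)` with the loss
of the unweighted approximation `best_rank_one(S_demo)` shows that the
weight, approximate, unweight route gives the smaller weighted residual.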
In practice one does not usually have data on the $\sigma_{ij}$, the
dispersions of the various individuals $j$ in the various tests $i$. If
the test reliabilities $r_i$ are available, a rough estimate of the
$\sigma_{ij}$ might be obtained by neglecting the differences from
individual to individual and considering $\sigma_{ij} = \sigma_i$ to
depend only upon the tests. Then, denoting the square of the length of
the observed test vector by $l_i^2 = \sum_{j=1}^{N} s_{ij}^2$, one might
take $N\sigma_i^2 = (1 - r_i)\, l_i^2$, or some similar expression, and
thus obtain values for the $\sigma_i$. This leads to a rank one
weighting matrix $H$, and thus comes under the case handled above. The
same is true if $\sigma_{ij}$ is the product of a quantity measuring the
“unreliability” of the test and a quantity measuring the
“unreliability” of the person: if values for $\sigma_{ij}$ were actually
known, one might take the best rank one approximation to the matrix
$(\sigma_{ij})$ to get such a form.

The rough $\sigma$ values obtained above give weighting factors

$$h_{ij} = \frac{1}{\sigma_i\sqrt{2}} = \sqrt{\frac{N}{2}} \cdot \frac{1}{l_i\sqrt{1 - r_i}}, \tag{5}$$

and it is then seen that in weighting $S$ to get the matrix $S_1$ upon
which the actual work of unweighted approximation is performed we are in
effect normalizing the test vectors to unit length, and then stretching
them by a factor of $1/\sqrt{1 - r_i}$ (apart from the common factor
$\sqrt{N/2}$). Thus tests with lower reliabilities come out to have
shorter vectors in the $S_1$ matrix.
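
The effect of the reliability-based weights on the test vectors can be
checked with a small sketch. It assumes, as in the text, that
$h = 1/(\sigma\sqrt{2})$ and $N\sigma_i^2 = (1 - r_i)\,l_i^2$; the score
values, reliabilities, and function names are hypothetical.

```python
import math
from typing import List


def reliability_weights(S: List[List[float]], r: List[float]) -> List[float]:
    """Per-test weights h_i = 1 / (sigma_i * sqrt(2)), with N * sigma_i^2 = (1 - r_i) * l_i^2.

    S[i] holds the N observed scores on test i; r[i] is its reliability (below 1).
    """
    weights = []
    for scores, rel in zip(S, r):
        n = len(scores)
        length_sq = sum(s * s for s in scores)            # l_i^2
        sigma = math.sqrt((1.0 - rel) * length_sq / n)    # rough sigma_i
        weights.append(1.0 / (sigma * math.sqrt(2.0)))
    return weights


def weighted_lengths(S: List[List[float]], r: List[float]) -> List[float]:
    """Lengths of the test vectors after weighting, i.e. of the rows of S_1."""
    h = reliability_weights(S, r)
    return [w * math.sqrt(sum(s * s for s in scores)) for w, scores in zip(h, S)]


S_demo = [[3.0, 4.0, 12.0, 0.0], [1.0, 1.0, 1.0, 1.0]]   # two tests, N = 4 persons
r_demo = [0.75, 0.96]
lengths = weighted_lengths(S_demo, r_demo)
```

Under these assumptions each weighted length equals
$\sqrt{N / (2(1 - r_i))}$ whatever the original length of the test
vector, so the less reliable test ends with the shorter vector.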

This is reminiscent of the familiar “communalities” procedure,
wherein the test vectors are normalized to unit length and then
shortened by an amount corresponding to the error variance; in the
present case the decrease would be to length $\sqrt{r_i}$. There is an
important difference, however, in that the present process does not
change the direction of the test vectors* and hence, over a considerable
range of reliability, does not appreciably affect the dimensionality of
the configuration, whereas the communality shortening is done without
corresponding reduction in the off-diagonal scalar product elements of
the correlation matrix and hence swings the test vectors all in toward
each other, thus artificially increasing the clustering (7) and
underestimating the number of factors. As an example, if two vectors
making an angle of 60° are shortened to 0.707 of their original lengths,
while their scalar product is kept unaltered, they will be swung
completely into line. Similarly, a length reduction to 0.84 will align
vectors originally 45° apart, and a reduction to 0.93 is sufficient to
align vectors which were 30° apart.

*In particular, if the tests have equal reliabilities, factoring the
correlation matrix with unity in the diagonals is indicated.
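
The figures in this example follow from the fact that shortening both
vectors by a factor $k$ while holding their scalar product fixed divides
the cosine of the angle between them by $k^2$. A minimal sketch, for
acute angles and with hypothetical function names:

```python
import math


def angle_after_shortening(theta_deg: float, k: float) -> float:
    """Angle in degrees between two vectors after each is shortened by the
    factor k while their scalar product keeps its original value."""
    cos_new = math.cos(math.radians(theta_deg)) / (k * k)
    cos_new = min(1.0, cos_new)     # a value above 1 means full alignment
    return math.degrees(math.acos(cos_new))


def aligning_factor(theta_deg: float) -> float:
    """Length factor at which vectors originally theta_deg apart become collinear."""
    return math.sqrt(math.cos(math.radians(theta_deg)))


factors = {theta: round(aligning_factor(theta), 3) for theta in (60, 45, 30)}
```

The three factors agree with the values 0.707, 0.84, and 0.93 quoted
above.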


IV 


In this connection it may be noted that the practice of subtracting
the error variance from the test variance is based essentially
upon identifying a single observed value for the square of the length
of the test vector with its expected value in an infinite sample. As a
simple illustration, with errors distributed as in (1), the expected
value of $s^2$ is given by

$$E(s^2) = t^2 + \sigma^2. \tag{6}$$

If this be equated to the square of an observed value $s_1$, the
result is

$$t^2 = s_1^2 - \sigma^2, \tag{7}$$

which corresponds to the communality procedure.
On the other hand the method of maximum likelihood yields as
the best estimate of $t^2$ simply $s_1^2$ itself.
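
The two statements can be traced in a short derivation, sketched here
under the assumptions already made: $s = t + x$, the error $x$ has mean
zero and variance $\sigma^2$, and a single observation $s_1$ is
available.

```latex
\begin{aligned}
E(s^2) &= E\big[(t + x)^2\big] = t^2 + 2t\,E(x) + E(x^2) = t^2 + \sigma^2, \\
\frac{\partial}{\partial t}\,\log p(s_1 - t) &= 2h^2\,(s_1 - t) = 0
  \;\Longrightarrow\; \hat{t} = s_1, \qquad \hat{t}^{\,2} = s_1^2 .
\end{aligned}
```

The first line is equation (6); the second uses the density (1) with
$x = s_1 - t$ and shows why the likelihood estimate involves no
subtraction of $\sigma^2$.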


(Section added in response to comments by a referee). A num- 
ber of writers on the factor problem, e.g., Lawley, have formulated 
it as if different individuals taking a set of k tests were merely draw- 
ing samples from the same k-way distribution of variables in normal 
correlation. Such a distribution is specified by the means and vari- 
ances of each test and the covariances of the tests in pairs; it has no 
parameters distinguishing different individuals. Such a formulation 
is therefore inappropriate for factor analysis, where factor loadings 
of the tests and of the individuals enter in symmetric fashion in a 
bi-linear form. It would perhaps be more suitable for psychophysics, 
where the differences between individuals are ignored and attention 
is focused on the test objects presented to them. 

The point is that in factor analysis different individuals are re- 
garded as drawing their scores from different k-way distributions, 
and in these distributions the mean for each test is the true score of 
the individual on that test. Nothing is implied about the distribution 
of observed scores over a population of individuals, and one makes 
assumptions only about the error distributions. In Section I above 
the errors were supposed to be normally distributed and uncorre- 
lated; other assumptions would lead to expressions other than (2) 
for the likelihood function. 
In section II the elements of the $T$ matrix are obtained as certain
functions of the elements of the $S$ matrix, the nature of which may be
concisely indicated by

$$T_1 = R\,S_1, \tag{8}$$

where (5) the matrix $R$ involves the characteristic vectors of the
matrix $S_1 S_1'$. Under the present assumption that the elements of $S$
and hence those of $S_1$ are distributed normally, it is seen that those
of $T_1$ and $T$ have a more complicated distribution because the
elements of $R$ themselves depend on $S$. To a first approximation,
however, we may neglect this last dependence and thus have the $t_{ij}$
as linear functions of the $s_{ij}$, with coefficients which become
known when the calculations of section II have been carried out on an
observed score matrix $S$. To this approximation the elements of $T$ are
distributed normally with variances expressed in terms of the
$\sigma_{ij}$ of the error distributions and the coefficients of the
just-mentioned linear functions. One can then state the probable error
of the $t_{ij}$ estimates or proceed further to the construction of
fiducial confidence belts.

The writer wishes to express his thanks to Dr. A. S. Household- 
er for helpful discussion of this problem. 

This work was aided in part by a grant from the Dr. Wallace C. 
and Clara A. Abbott Memorial Fund of the University of Chicago. 
A MACHINE METHOD FOR COMPUTING THE BISERIAL
CORRELATION COEFFICIENT IN ITEM VALIDATION 


ELMER B. ROYER* 


A method for computing the biserial correlation coefficients with
the aid of punch card equipment is outlined. A numerical example
and a work sheet layout are included in the presentation.


Pearson gives the following formula for biserial correlation,

$$r_{\text{biserial}} = \frac{M_1 - M_T}{\sigma_T} \cdot \frac{p}{z}, \tag{1}$$

where $M_T$ = Mean of the total distribution of the Y-variable

$M_1$ = Mean in the Y-variable of category-one of the
dichotomized or X-variable

$\sigma_T$ = Standard deviation of the total distribution of
the Y-variable

$p$ = Proportion of cases included in category-one of
the X-variable

$z$ = The ordinate of the normal curve at the point
of dichotomy.


The Pearson formula is well suited to the computation of biserial
correlations in item validation, since the mean and standard deviation
of the criterion variable need be computed only once for computing
any number of item validity coefficients. In this case a slight change
yields an algebraically equivalent formula suited to item validation
by punched card treatment:

$$r_{\text{bis}} = \frac{1}{N\sigma_T} \cdot \frac{\Sigma Y_1 - p\,\Sigma Y_T}{z},$$

where $\Sigma Y_1$ = Sum of the criterion scores of those passing the
item

$p$ = Proportion of cases passing the item = $N_1/N$

$\Sigma Y_T$ = Sum of criterion scores of the total group

$N$ = Number of cases in the total group

($\sigma_T$ and $z$ have the same meanings as above, and $N_1$ is the
number of persons responding successfully to the item.)

Items are scored: Fail = 0; Pass = 1.

In the above expression, the term $1/(N\sigma_T)$ obviously is a
constant in the computation of the validity coefficients for a given
population and criterion.

* This method was developed by Dr. Elmer B. Royer prior to his untimely
death on April 3, 1939. It has been prepared for publication by some of
his former associates, not only as a contribution to science but also as
a tribute to the memory of a true scientist.
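
The algebraic equivalence of the punched-card form and Pearson's formula
can be verified in one line, using only the definitions
$M_1 = \Sigma Y_1 / N_1$, $M_T = \Sigma Y_T / N$, and $N_1 = pN$:

```latex
r_{\text{bis}}
  = \frac{M_1 - M_T}{\sigma_T}\cdot\frac{p}{z}
  = \frac{p}{\sigma_T z}\left(\frac{\Sigma Y_1}{pN} - \frac{\Sigma Y_T}{N}\right)
  = \frac{1}{N\sigma_T}\cdot\frac{\Sigma Y_1 - p\,\Sigma Y_T}{z}.
```

Only $\Sigma Y_1$, $p$, and $z$ change from item to item, which is what
makes the form convenient for machine tabulation.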


Coding the Data 


Assuming that 150 two-response items have been given to 298
subjects, the correct responses may be punched in 50 columns of an
80-column card. The coding may be set up somewhat in this fashion:

Cols. 1 to 10: Identifying material.

Cols. 11 and 12: Criterion scores (Y).

Cols. 13 to 62: Item responses. Punches in row 1 indicate successful
responses only for items 1 to 50; in row 3, for items 51 to 100; in
row 5, for items 101 to 150. Thus, if 1 is punched in column 14, it
indicates that item 2 has been correctly answered; and if 3 is not
punched in column 15, it indicates that item 53 has been answered
incorrectly, etc. The coding of item responses may vary with the number
of items and responses. The only requisite is that it be easily possible
to sort out the group making a particular response and subsequently to
tabulate the criterion scores of that group.

Cols. 63, 64, 65, and 66: The reciprocal of N, in this case .003356,
which is punched without the initial zeros.

Col. 67: Punch “1” in each card for use in obtaining an item count.
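
The item coding described above amounts to a simple mapping from a card
column and punch row to an item number. The sketch below illustrates
that mapping only; the dictionary representation of a card and the
function names are hypothetical.

```python
from typing import Dict, Set

FIRST_ITEM_COLUMN = 13                  # item responses occupy columns 13 to 62
ITEMS_PER_ROW = 50
ROW_OFFSETS = {1: 0, 3: 50, 5: 100}     # punch row -> item-number offset


def item_number(column: int, row: int) -> int:
    """Item whose successful response is recorded by a punch at (column, row)."""
    if not FIRST_ITEM_COLUMN <= column < FIRST_ITEM_COLUMN + ITEMS_PER_ROW:
        raise ValueError("column outside the item-response field")
    if row not in ROW_OFFSETS:
        raise ValueError("row not used for item responses")
    return (column - FIRST_ITEM_COLUMN + 1) + ROW_OFFSETS[row]


def passed_items(punches: Dict[int, Set[int]]) -> Set[int]:
    """Items passed on one card; punches maps a column to its punched rows."""
    return {item_number(col, row)
            for col, rows in punches.items()
            if FIRST_ITEM_COLUMN <= col < FIRST_ITEM_COLUMN + ITEMS_PER_ROW
            for row in rows
            if row in ROW_OFFSETS}
```

For instance, `item_number(14, 1)` gives item 2 and `item_number(15, 3)`
gives item 53, matching the two cases mentioned in the text.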




TABULATION: Steps in finding $\Sigma Y_T$ and $\sigma_T$

Assuming that there are scores of large magnitude in the criterion
variable, the digiting method of obtaining the summations (implying
that Y = 10a + b, where a = tens-digit and b = units-digit) is
employed.

1. Sort on units column of criterion scores, and arrange cards
in descending order of value.

2. Set tabulating machine to control on units column and to add
and cumulate criterion scores. Print Y-units column in the
list bank and print the item count and criterion scores as
progressive totals. (See tabulation below.)

3. Sum the cumulated Y-scores column, omitting the zero-class
entries. Enter this number, 56996, on the work sheet where it
reads “Σ units.”

4. The highest value in the cumulated scores column will be
$\Sigma Y_T$. Enter this number on the work sheet in the place
designated.

5. Repeat steps 1, 2, and 3, sorting instead on the tens column
of criterion scores.

6. Enter “Σ tens” on work sheet. Check $\Sigma Y_T$.

7. Enter N on work sheet.

The tabulation sheets will appear as follows:

Units          Cumulated Frequency    Cumulated
Column of Y    (Item Count)           Y Scores

    9                 25                 1165
    8                 53                 2799
    7                 86                 4230
    6                119                 5708
    5                147                 6448
    4                171                 7494
    3                209                 8798
    2                239                 9768
    1                267                10586
    0                298 = N            11446 = ΣY_T

Σf(units) = 1316          “Σ units” = 56996

Tens           Cumulated Frequency    Cumulated
Column of Y    (Item Count)           Y Scores

    9                 46                 4381
    8                 17                 7717
   (7)*
    6                125                 9091
    5                109                 9256
    4                110                 9298
    3                116                 9480
    2                143                10116
    1                210                11034
    0                298 = N            11446 = ΣY_T**

Σf(tens) = 1013           “Σ tens” = 78090

* Since there are no 7’s, the respective extensions of the 8-row must
be added twice, as indicated by the parentheses.
** Check: ΣY_T = 10 Σf(tens) + Σf(units); 11446 = 10 (1013) + 1316.


All the values needed to compute the constant $1/(N\sigma_T)$ are now
known. The operations yielding this value are indicated on the work
sheet.
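
The arithmetic behind the digiting method can be sketched in code: with
Y = 10a + b, the sum of the progressive item counts over digit classes 9
to 1 equals the sum of that digit over all cards, and the sum of the
progressive score totals equals the sum of digit times Y. The sketch
assumes two-digit scores, and the function names are hypothetical.

```python
from typing import List, Tuple


def progressive_digit_sums(scores: List[int], place: int) -> Tuple[int, int]:
    """Mimic one sort-and-tabulate pass on the digit at `place` (1 = units, 10 = tens).

    Cards are taken in descending order of the digit, and after each digit
    class the running item count and running score total are recorded.
    Summing these records over classes 9..1 (an empty class repeats the
    previous record) gives sum(digit) and sum(digit * Y).
    """
    count_total = 0
    score_total = 0
    running_count = 0
    running_score = 0
    for digit in range(9, 0, -1):               # the zero class is omitted
        for y in scores:
            if (y // place) % 10 == digit:
                running_count += 1
                running_score += y
        count_total += running_count
        score_total += running_score
    return count_total, score_total


def sums_by_digiting(scores: List[int]) -> Tuple[int, int]:
    """Return (sum of Y, sum of Y squared) for scores between 0 and 99."""
    f_units, y_units = progressive_digit_sums(scores, 1)
    f_tens, y_tens = progressive_digit_sums(scores, 10)
    sum_y = 10 * f_tens + f_units               # the check on the tabulation sheet
    sum_y_sq = 10 * y_tens + y_units            # "sum tens" x 10 plus "sum units"
    return sum_y, sum_y_sq
```

With the totals from the tabulation sheets, 10 (1013) + 1316 = 11446 and
10 (78090) + 56996 = 837896, the latter being the sum of squared
criterion scores entered on the work sheet.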

To obtain $r_{\text{bis}}$ for any item, sort out the cards of those
persons passing the item. Place these in the hopper of the tabulator and
set the machine to add on the criterion scores, the reciprocal of N, and
the item count. The tabulation of five items would appear:

Item    Item Count    ΣY_1     p
  1        148        4401    496688
  2        106        5130    355736
  3        223        9421    748388
  4         78        2900    261768
  5         26        1160    087256

Enter $\Sigma Y_1$ and $p$ on the work sheet, and find $z$ in the
Kelley-Wood tables and record. Follow the operations indicated by the
work sheet. The computation of $r_{\text{bis}}$ for the five items above
is shown on the work sheet.
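
The same computation can be sketched in code for the five items
tabulated above. One assumption departs from the procedure in the text:
the ordinate $z$ is computed from the normal distribution instead of
being read from printed tables, so results may differ from the work
sheet in the third decimal place. The function and variable names are
hypothetical.

```python
import math
from statistics import NormalDist


def biserial_r(sum_y1: float, p: float, sum_yt: float, n: int, sum_y_sq: float) -> float:
    """r_bis = (1 / (N * sigma_T)) * (sum_Y1 - p * sum_YT) / z."""
    n_sigma = math.sqrt(n * sum_y_sq - sum_yt ** 2)    # N * sigma_T
    std_normal = NormalDist()
    z = std_normal.pdf(std_normal.inv_cdf(p))          # ordinate at the point of dichotomy
    return (sum_y1 - p * sum_yt) / (n_sigma * z)


N = 298
SUM_YT = 11446
SUM_Y_SQ = 837896
RECIPROCAL_N = 0.003356                                # as punched on the cards
items = {1: (148, 4401), 2: (106, 5130), 3: (223, 9421), 4: (78, 2900), 5: (26, 1160)}

validities = {}
for item, (n_pass, sum_y1) in items.items():
    p = round(n_pass * RECIPROCAL_N, 4)                # four places, as on the work sheet
    validities[item] = biserial_r(sum_y1, p, SUM_YT, N, SUM_Y_SQ)
```

The resulting coefficients can be compared with the work sheet values of
-.296, .261, .246, -.027, and .093.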


WORK SHEET

Computation of r_bis = (1 / Nσ_T) × (ΣY_1 - p ΣY_T) / z

Computation of Constant

Operation                              Result
N                                      298
Σ units                                56,996
Σ tens                                 780,900
Σ hundreds                             00
Add (= ΣY_T²)                          837,896
N ΣY_T²                                249,693,008
ΣY_T                                   11446
(ΣY_T)²                                131,010,916
N ΣY_T² - (ΣY_T)² (= (Nσ_T)²)          118,682,092
√(Nσ_T)² (= Nσ_T)                      10894.1
1/Nσ_T (constant multiplier)           .000091793
M_T = ΣY_T / N                         38.41
σ_T = Nσ_T / N                         36.56

Computation of Item Validities

Item    ΣY_1    p        z        ΣY_1 - pΣY_T    (ΣY_1 - pΣY_T)/z    r_bis
  1     4401    .4967    .3989    -1284.228*       -3219.423           -.296
  2     5130    .3557    .3726     1058.658         2841.272            .261
  3     9421    .7484    .3189      854.814         2680.508            .246
  4     2900    .2618    .3255      -96.563         -296.661           -.027
  5     1160    .0873    .1587      160.764         1013.006            .093

*Check on item 1 computations:
ΣY_1 - pΣY_T = (1 - p)ΣY_T - (ΣY_T - ΣY_1)
-1284.228 = (1 - .4967) 11446 - (11446 - 4401)
-1284.228 = .5033 × 11446 - 7045
-1284.228 = -1284.228

REFERENCES


1. Dunlap, J. W. Computation of descriptive statistics. New York: Ralph C.
Coxhead Corporation, 1937, 78-81.

2. Dunlap, J. W. and McNamara, W. J. A graphical method for computing the
standard error of biserial r. J. exper. Educ., 1934, 2, 274-277.

3. Pearson, K. On a new method of determining correlation between a measured
character A and a character B, of which only the percentage of cases wherein
B exceeds or falls short of a given intensity is recorded for each grade of A.
Biometrika, 1909, 7, 96-105.
MUSIC ABILITY


J. E. KARLIN 


UNIVERSITY OF CHICAGO 


Two batteries of music tests were factored by the centroid meth- 
od. From each battery three oblique factors were extracted and in 
each case were tentatively identified as tonal sensitivity, retentivity 
(memory for elements), and memory for form. The correlations of 
the music tests of one battery with subtests of Cattell’s intelligence 
test and with tests of a literary nature are also reported. 


It has been held that musical ability is some dominantly unitary 
feature of mental life incapable of analysis into simpler components 
by rational methods. This study is a preliminary investigation of the 
music field by Thurstone’s method of multiple factor analysis. 

An account is given of an analysis of two batteries of music tests
typical of the music test literature. The first battery was assembled
by the present writer,* the second by Drake.† The battery given by
the writer to 120 undergraduate students in the University of Cape
Town (South Africa) consisted of 19 tests, as follows:

Tests 1 to 6 were the six parts of R. B. Cattell’s intelligence test,
scale III; tests 7 to 9 inclusive were of a literary nature; tests 10 to
19 were chosen as music tests. The origin of each music test is
indicated in Table 1, those without any specific indication having been
devised either by the writer or by members of the faculty of music
in the University of Cape Town after the pattern of the Seashore tests,
which were themselves deemed too difficult for the population to be
tested. It was hoped that the inclusion of intelligence tests in the
same battery as the music tests would throw light on the much debated
question of cognition in music. Similarly, the literary tests
could perhaps provide the first steps towards possible evidence of a
general artistic factor. The table of intercorrelations, computed as
Pearson product-moment coefficients, is reproduced in Table 1 and is
immediately seen to be highly informative on both the foregoing
points. While the literary and intelligence tests correlate highly with
each other and among themselves, of the 90 intercorrelations between
the music tests and the rest of the battery, only 25 were as high as
.10 and the mean correlation was only .05.

* Karlin, J. E. A multiple factor analysis of musicality. M. A. thesis,
University of Cape Town, 1939.
† Drake, R. M. A factorial analysis of music tests by the Spearman
tetrad-difference technique. J. Musicology, 1939, 1, 1.


TABLE 1

 2   .512
 3   .409  .319
 4   .282  .450
 5   .846  .269  .411  .413
 6   .299  .856  .391  .493  .333
 7   .391  .264  .285  .336  .437  .265
 8   .564  .270  .493  .476  .502  .451  .476
 9   .896  .853  .400  .429  .430  .354  .432  .562
10  -.056 -.069  .025 -.052 -.096 -.099  .171 -.005 -.084
11   .072  .096  .018 -.017  .022 -.030  .059  .026 -.072  .060
12  -.114 -.055 -.049 -.059 -.025 -.073  .020 -.025  .081  .343  .122
13  -.110  .028 -.130 -.061 -.020 -.034 -.034 -.002 -.075  .061  .001  .095
14   .188  .204  .098  .163  .094  .206  .044  .281  .095  .110  .072  .210  .110
15  -.091 -.042  .069  .130 -.124 -.014  .034  .064  .004  .859  .203  .489  .062  .228
16   .088  .057  .067  .096  .023 -.036 -.060  .137  .049  .269  .250  .267  .191  .331  .592
17   .161  .059  .016  .018  .124  .131  .141 -.084  .029  .227  .262  .389 -.011  .322  .342  .510
18   .062  .285 -.079  .007 -.079 -.082 -.120 -.002 -.105 -.009  .0938 .067  .059  .002  .056 -.025 -.056
19   .066  .016  .127  .169  .152  .080  .168  .3824 .285  .176  .032  .091 -.005  .218  .180  .326  .231 -.023

 1. Synonyms                    8. Vocabulary                14. Rhythm
 2. Classification              9. Poetical appreciation     15. Time
 3. Opposites                  10. Pitch discrimination      16. Musical Memory (Drake)
 4. Analogies                  11. Tonal memory              17. Retentivity (Drake)
 5. Completion of sentences    12. Interval discrimination   18. Intensity (Seashore)
 6. Inferences                 13. Consonance                19. Emotional sensitivity
 7. Reading comprehension


It might appear, then, that musical ability pertains largely to a field 
of its own. Yet it may be unwise to conclude this too hastily. With 
the isolation of the primary mental abilities by Thurstone, the concept 
of general intelligence or g , as Spearman put it, is becoming less and 
less widely accepted as meaningful in any unitary sense. It may be, 
however, that there is over-lap between the more elemental compo- 
nents of intelligence and fundamental abilities peculiar to the music 
domain. Likewise the lack of correlation between the music tests and 
the literary tests indicates the closeness of identity of the verbal fac- 
tor with some aspect of general intelligence. 

The battery was accordingly split into the musical and non-musical
halves, and the music battery of ten tests, that is, tests 10 to 19
inclusive, was factored by the centroid method. After three factors
were extracted, the median residual coefficient was .034 and the
median probable error corrected for attenuation .035, and the highest
residual was .091. The residue was therefore deemed unsystematic.
Table 2 shows the rotation of the three centroid vectors to the three
primary vectors of the present system in accordance with the
demands of simple structure. Table 3 gives the correlations of the
primaries. Figure 1 is a pictorial representation of the trait
configuration in relation to the primary vectors as derived by the
method of extended vectors.*

FIGURE 1. Trait configuration in relation to the primary vectors
(Tonal Sensitivity, Retentivity, Memory).


TABLE 2 
F. F, 
I Il lll x Y Z 
10. .448 —.281 .148 —.002 
11. .073 —.169 048 013 
12. .564 —.347 .190 A 004 ~—-.005 
“x Y Z O17 214 


14. 443 241 150 .287 333 406 398 —.057 
15. 697 —.246 —.187 —.933 —.096 437 414 | 
16. -732 291 —.243 573 133 —.901 —.060 561 | 


17. 625 .237 .236 A9T .018 
18. .059 —110 —.177 017 ~=.167 
19. 814 150 .197 318 —.009 —.027 
TABLE 3

Λ′Λ

        X        Y        Z
X     1.000
Y     -.545    1.000
Z     -.282     .112    1.000


Two observations are called for: 

1. Test 18 (Intensity) is omitted from consideration since it showed 
negligible correlation with the rest of the battery almost through- 
out, having a communality of .047. 

2. The condition of a positive manifold is fulfilled. 


* Thurstone, L. L. A new rotational method in factor analysis.
Psychometrika, 1938, 3, 199-218.




The only claim made regarding musical ability in this battery is
that the variance present in the tests, from .10 to .68, is plausibly
explained here in terms of three factors. It will be necessary to devise
tests which will have much higher communalities before it becomes
evident how many psychological factors are involved in any such
battery. It is unlikely that musical ability in general can be reduced
to only three functional unities. With such small batteries, the
insufficiency of data allows only of a somewhat vague structure; although
the general character of the three factors can be seen with reasonable 
assurance in that the trait configuration did functionally outline a 
three-dimensional simple structure, it is necessary to have many more 
tests in order that the positions of the planes may be determined. 
With further study, the number of planes will be more exactly defined 
so as to give a multi-dimensional system, which may or may not be 
orthogonal. The present system is oblique, but it may well be that as 
the parameters become overdetermined by test data the planes will 
define themselves as being orthogonal. In the present case, factor Y 
appears to be some sort of tonal sensitivity factor, having its greatest 
weight on tests 10 and 12; factor X seems to be a retentivity or mem- 
ory for elements factor with highest load on test 17; factor Z is a 
memory for form factor with maximal saturation in test 16. The two 
memory factors are obscure in outline apart from their retentive na- 
ture. It is imperative that future work be directed towards devising 
many further tests which will serve to accentuate the planes in general 
and the corners of the structure in particular. 

A very similar procedure was adopted for a reanalysis of music 
test data assembled by Drake. His table of raw coefficients is repro- 
duced in Table 4. 


TABLE 4

                         1      2      3      4      5      6      7
Musical memory    1.
Pitch             2.   .466
Retentivity       3.   .456   .311
Rhythm            4.   .441   .296   .185
Intensity         5.   .875   .521   .184   .176
Time              6.   .812   .286   .800   .244   .389
Tonal movement    7.   .483   .878   .211   .210
Tonal memory      8.   .207   .814   .878   .841   .1538  .289   .504

Again three factors were all that could be extracted, with the median
residual coefficient .036 and the median probable error corrected for
attenuation .029. With only 8 tests, a fourth factor is not justified.
The rotation of the axes is shown in Table 5 and the correlation of
the primaries is given by Table 6.
TABLE 5
F, A F, 

I Ill x Y Z 
1 —.258 —.202 510 .073 —.039 
2 692 —1388 346 X Y Z 048 .186 
3 .573 —198  .480 328 «427 —.048 
4 —169 —323 —.096 —203 .985 .531 —.108 —.028 
5 —.369 .283 —.873 926 085 051 —.134 
6 —.100 —.075 326 120 
7 886 .274 000 
8 .3878 —.109 388 

TABLE 6

Λ′Λ

        X        Y        Z
X     1.000
Y     -.634    1.000
Z     -.001    -.154    1.000


From Figure 2 it would appear that here too a three-dimensional
oblique simple structure prevails. Factor Y looks very much like the
tonal sensitivity factor already identified, with highest load on
test 2; factor X is a memory factor with heavy loads on tests 1, 3,
and 4; factor Z is probably the retentivity factor, it being most
evident in tests 7, 8, and 3.

FIGURE 2. Trait configuration of the Drake battery in relation to the
primary vectors (Tonal Sensitivity, Retentivity).

The same warnings must be given here as were appropriate in the 
previous analysis. The agreement between the results of the two 
analyses is promising for further and more extensive studies. Such 
studies are in progress at the present time. Their ultimate purpose 
is the isolation of the primary musical abilities. 